Problems arising after S4TF V0.3 -> v0.4 #8

Closed
F20170604 opened this issue Jun 30, 2019 · 7 comments

@F20170604
Owner

F20170604 commented Jun 30, 2019

Tutorial 3 Colab Link : https://colab.research.google.com/drive/1XfT75glapWdCp0Zj_4mkiMrjT6odD-Gc
Tutorial 5 Colab Link : https://colab.research.google.com/drive/13lBsht3Wa4GjKKkA47JCrd54XikhNX2E

I am hitting an out-of-memory (OOM) error while allocating a tensor during classifier training in Tutorial 5. I have tried 12 times, and every time it happens after epoch 4 of 100.
In addition, I cannot get the packages to install. The error I get is 'toolchain is invalid'.
Both problems started after the S4TF update to v0.4.

Screenshots of the OOM error:
[Screenshot 2019-06-30 at 5 35 55 PM]
[Screenshot 2019-06-30 at 5 35 28 PM]
Screenshot of the package installation error:
[Screenshot 2019-06-30 at 10 54 29 PM]

F20170604 changed the title from "Memory Leak during training - Tutorial 5" to "Problems arising after S4TF V0.3 -> v0.4" on Jun 30, 2019
@F20170604
Owner Author

The OOM error is resolved after modifying callAsFunction.
I changed it to:

public func callAsFunction(_ input: Input) -> Output {
    // Apply each convolution and its pooling layer with direct calls.
    let convolved1 = pool1(conv1a(input))
    let convolved2 = pool2(conv1b(convolved1))
    let convolved3 = pool3(conv1c(convolved2))
    let convolved4 = pool4(conv1d(convolved3))
    // Flatten, then run the two final layers.
    return layer1b(layer1a(flatten(convolved4)))
}

@F20170604
Owner Author

F20170604 commented Jun 30, 2019

Package installation is still failing. I looked through the new fast.ai notebooks and copied their import commands exactly, but I still get the same error.

@RahulBhalley

Any updates on the memory leak, @rxwei? I am also still experiencing this issue in my own notebook: I was able to train my GAN successfully in v0.3, but in v0.4 it fails with the "memory limits reached" error shown in this issue above.

Or @Ayush517, did you find a way to fix this issue in your own code?

@F20170604
Owner Author

@RahulBhalley Earlier, my network's forward pass was this:

func call(_ input: Input) -> Output {
    // Original forward pass: layers are chained with sequenced(through:).
    let convolved1 = input.sequenced(through: conv1a, pool1)
    let convolved2 = convolved1.sequenced(through: conv1b, pool2)
    let convolved3 = convolved2.sequenced(through: conv1c, pool3)
    let convolved4 = convolved3.sequenced(through: conv1d, pool4)
    return convolved4.sequenced(through: flatten, layer1a, layer1b)
}

After I changed it to the following code, it started working fine:

public func callAsFunction(_ input: Input) -> Output {
    // Same layers as before, applied with direct nested calls.
    let convolved1 = pool1(conv1a(input))
    let convolved2 = pool2(conv1b(convolved1))
    let convolved3 = pool3(conv1c(convolved2))
    let convolved4 = pool4(conv1d(convolved3))
    return layer1b(layer1a(flatten(convolved4)))
}
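
For anyone comparing the two versions above: as far as I can tell, they are meant to compute the same forward result, and only the way the layers are composed differs. Here is a minimal sketch of that equivalence in plain Swift (5.2 or later, for callAsFunction), with no TensorFlow dependency. The Scale and Shift types and this sequenced(through:) helper are hypothetical stand-ins written for illustration; they are not the S4TF Layer protocol or the library's own implementation, and the sketch says nothing about memory behavior.

```swift
// Hypothetical stand-ins for layers: callable values acting on a Double.
// These are NOT Swift for TensorFlow types.
struct Scale {
    var factor: Double
    func callAsFunction(_ input: Double) -> Double {
        return input * factor
    }
}

struct Shift {
    var offset: Double
    func callAsFunction(_ input: Double) -> Double {
        return input + offset
    }
}

extension Double {
    // Illustrative helper that mirrors the shape of sequenced(through:):
    // feed self through `first`, then feed that result through `second`.
    func sequenced(through first: Scale, _ second: Shift) -> Double {
        return second(first(self))
    }
}

let conv = Scale(factor: 2.0)   // plays the role of conv1a
let pool = Shift(offset: 1.0)   // plays the role of pool1
let input = 3.0

let viaSequenced = input.sequenced(through: conv, pool)
let viaNestedCalls = pool(conv(input))

// Both styles yield the same value for the same input.
print(viaSequenced == viaNestedCalls)
```

If the two styles agree like this in the real library as well, then the nested-call rewrite is a workaround for how v0.4 handles the sequenced form, not a change to the model itself.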

@RahulBhalley

RahulBhalley commented Jul 8, 2019

Let's check. Fingers crossed🤞. I hope this is the only problem with the sequenced method.

Edit: Nope, crashed again @rxwei. Please help me resolve this issue. ☹️

@rxwei
Collaborator

rxwei commented Jul 8, 2019

Hi @RahulBhalley, sorry for the delay. Could you try building a toolchain against the latest HEAD of the tensorflow branch? swiftlang/swift#25967 fixed the major issue and could resolve your OOM problem. We'll release another 0.4 release candidate later today.

@RahulBhalley

RahulBhalley commented Jul 8, 2019

Sorry, I can't build it because I'd rather not heat up my MacBook Pro (its battery was recently replaced). I'll wait for the newer 0.4 release to go live on Google Colab today (if I understand you correctly 😃). Thanks for replying, by the way, @rxwei.


5 participants