Recently I implemented a VGG-16 network using both TensorFlow and PyTorch, with the CIFAR-10 data set. Each image is 32 × 32 RGB.
I started with a batch size of 64 and noticed that PyTorch uses much less GPU memory than TensorFlow. I then ran some experiments and produced the figure posted below.
After some research, I learned that TensorFlow uses the BFC (best-fit with coalescing) algorithm to manage memory. That explains why TensorFlow's memory usage grows or shrinks in steps of 2048, 1024, ... MB, and why usage sometimes stays flat even when the batch size increases.
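To illustrate the stepping behavior, here is a toy model (not TensorFlow's actual allocator code) of rounding a requested size up to the next power-of-two bin, the assumption being that BFC-style binning means two different batch sizes can land in the same bin while crossing a bin boundary jumps usage by the whole step:

```python
def round_up_pow2(n: int) -> int:
    """Round n up to the next power of two (toy model of bin sizing)."""
    p = 1
    while p < n:
        p *= 2
    return p

# Requests of 1500 and 2048 land in the same 2048 bin, so reported
# memory would not change between them; 2049 jumps to the 4096 bin.
print(round_up_pow2(1500))  # 2048
print(round_up_pow2(2048))  # 2048
print(round_up_pow2(2049))  # 4096
```

This is only a sketch of why usage moves in coarse steps; the real allocator bins individual tensor allocations, not the whole batch.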
But I am still confused: why is memory usage lower at batch size 512 than at the smaller batch sizes 384 and 448? The same thing happens from batch size 1024 to 1408, and from 2048 to 2688.
Here is my source code:
Edit: I have two Titan XPs in my computer; OS: Linux Mint 18.2 64-bit.
I determine GPU memory usage with the command .
My code runs on GPU1, which is selected in my code:
And I am sure only one application is using GPU1.
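For reference, a common idiom for pinning a run to the second GPU (index 1) looks like the sketch below. This is the standard `CUDA_VISIBLE_DEVICES` approach, not necessarily the exact snippet from the original code:

```python
import os

# Hide all GPUs except physical device 1 before any framework
# (TensorFlow or PyTorch) initializes CUDA. Inside the process,
# that device is then visible as device 0.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
print(os.environ["CUDA_VISIBLE_DEVICES"])  # 1
```

Because the environment variable must be set before CUDA initialization, it should appear before any `import tensorflow` or `torch.cuda` call.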
GPU memory usage can be read from the application list shown below. For example, in the posted screenshot, the process name is and its GPU memory usage is 1563 MiB.