How to reduce/free memory when using xarray datasets? - python-3.x

This is the memory_profiler output for a function in my code that uses xarray (v0.16.1) datasets:
Line # Mem usage Increment Line Contents
================================================
139 94.195 MiB 94.195 MiB @profile
140 def getMaps(ncfile):
141 335.914 MiB 241.719 MiB myCMEMSdata = xr.open_dataset(ncfile).resample(time='3H').reduce(np.mean)
142
143 335.945 MiB 0.031 MiB plt.figure(figsize=(20.48, 10.24))
144
145 # projection, lat/lon extents and resolution of polygons to draw
146 # resolutions: c - crude, l - low, i - intermediate, h - high, f - full
147 336.809 MiB 0.863 MiB map = Basemap(projection='merc', llcrnrlon=-10.,
148 335.945 MiB 0.000 MiB llcrnrlat=30., urcrnrlon=36.5, urcrnrlat=46.)
149
150
151 339.773 MiB 2.965 MiB X, Y = np.meshgrid(myCMEMSdata.longitude.values,
152 336.809 MiB 0.000 MiB myCMEMSdata.latitude.values)
153 348.023 MiB 8.250 MiB x, y = map(X, Y)
154
155 # reduce arrows density (1 out of 15)
156 348.023 MiB 0.000 MiB yy = np.arange(0, y.shape[0], 15)
157 348.023 MiB 0.000 MiB xx = np.arange(0, x.shape[1], 15)
158 348.023 MiB 0.000 MiB points = np.meshgrid(yy,xx)
159
160 #cycle time to save maps
161 348.023 MiB 0.000 MiB i=0
162 742.566 MiB 0.000 MiB while i < myCMEMSdata.time.values.size:
163 742.566 MiB 305.996 MiB map.shadedrelief(scale=0.65)
164 #waves height
165 742.566 MiB 0.000 MiB waveH = myCMEMSdata.VHM0.values[i, :, :]
166 742.566 MiB 0.000 MiB my_cmap = plt.get_cmap('rainbow')
167 742.566 MiB 0.043 MiB map.pcolormesh(x, y, waveH, cmap=my_cmap, norm=matplotlib.colors.LogNorm(vmin=0.07, vmax=4.,clip=True))
168 # waves direction
169 742.566 MiB 0.000 MiB wDir = myCMEMSdata.VMDR.values[i, :, :]
170 742.566 MiB 0.242 MiB map.quiver(x[tuple(points)],y[tuple(points)],np.cos(np.deg2rad(270-wDir[tuple(points)])),np.sin(np.deg2rad(270-wDir[tuple(points)])),
171 742.566 MiB 0.000 MiB edgecolor='lightgray', minshaft=4, width=0.007, headwidth=3., headlength=4., linewidth=.5)
172 # save plot
173 742.566 MiB 0.000 MiB filename = pd.to_datetime(myCMEMSdata.time[i].values).strftime("%Y-%m-%d_%H")
174 742.566 MiB 0.086 MiB plt.show()
175 742.566 MiB 39.406 MiB plt.savefig(TEMPDIR+filename+".jpg", quality=75)
176 742.566 MiB 0.000 MiB plt.clf()
177 742.566 MiB 0.000 MiB del wDir
178 742.566 MiB 0.000 MiB del waveH
179 742.566 MiB 0.000 MiB i += 1
180
181 #out of loop
182 581.840 MiB 0.000 MiB plt.close("all")
183 581.840 MiB 0.000 MiB myCMEMSdata.close()
184 441.961 MiB 0.000 MiB del myCMEMSdata
As you can see, the allocated memory is not freed, and after many runs the program is eventually killed ("Killed") because the machine runs out of memory.
How can I free the memory allocated by the dataset? I have tried both dataset.close() and deleting the variable, with no success.
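Not an authoritative fix, but a minimal sketch of the memory-hygiene changes I would try first: open the dataset in a context manager and load the resampled result so the file handle can be dropped, create and close one figure per frame with plt.close() instead of only plt.clf(), skip plt.show() inside the loop, and force a garbage-collection pass at the end. Names such as get_maps and tempdir are placeholders for the code in the question.
import gc

import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

def get_maps(ncfile, tempdir):
    # Pull the resampled result into memory and release the NetCDF file handle
    with xr.open_dataset(ncfile) as ds:
        data = ds.resample(time='3H').reduce(np.mean).load()

    for i, t in enumerate(data.time.values):
        fig = plt.figure(figsize=(20.48, 10.24))
        # ... Basemap / pcolormesh / quiver calls from the original code go here ...
        filename = pd.to_datetime(t).strftime("%Y-%m-%d_%H")
        fig.savefig(tempdir + filename + ".jpg")
        plt.close(fig)   # close the figure itself, not just clear it

    del data
    gc.collect()         # encourage CPython to release what it can after the loop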

Related

Memory leak where CPython extension returns a 'PyList_New' instance to Python, which is never deallocated

I've been trying to debug a memory leak for a few days and I'm running out of ideas.
High-level: I've written a CPython extension that allows querying against binary data files, and it returns the results as a Python list of objects. Usage is similar to this pseudocode:
for config in configurations:
    s = Strategy(config)
    for date in alldates:
        data = extension.getData(date)
        # do analysis on 'data', capture/save statistics
I've used tracemalloc, memory_profiler, objgraph, sys.getrefcount, and gc.get_referrers to try to find the root cause, and these tools all point to this extension as the source of an exorbitant amount of memory (many gigs). For context, a single record in the binary file is 64 bytes and there are typically 390 records per day, so each date iteration works with ~24K bytes. Now, there are many iterations happening (synchronously), but in each iteration the data is used as a local variable, so I expected each subsequent assignment to deallocate the previous object. The results from memory_profiler suggest otherwise...
Line # Mem usage Increment Occurences Line Contents
============================================================
86 33.7 MiB 33.7 MiB 1 @profile
87 def evaluate(self, date: int, filterConfidence: bool, limitToMaxPositions: bool, verbose: bool) -> None:
92 112.7 MiB 0.0 MiB 101 for symbol in self.symbols:
93 111.7 MiB 0.0 MiB 100 fromdate: int = TradingDays.getAdjacentDay(date, -(self.config.analysisPeriod - 1))
94 111.7 MiB 0.0 MiB 100 throughdate: int = date
95
96 111.7 MiB 0.0 MiB 100 maxtime: int = self.config.maxTimeToGain
97 111.7 MiB 0.0 MiB 100 target: float = self.config.profitTarget
98 111.7 MiB 0.0 MiB 100 islong: bool = self.config.isLongStrategy
99
100 111.7 MiB 0.8 MiB 100 avgtime: Optional[int] = FileStore.getAverageTime(symbol, maxtime, target, islong, fromdate, throughdate, verbose)
101 111.7 MiB 0.0 MiB 100 if avgtime is None:
102 110.7 MiB 0.0 MiB 11 continue
103
104 112.7 MiB 78.3 MiB 89 weightedModel: WeightedModel = self.testAverageTimes(symbol, avgtime, fromdate, throughdate)
105 112.7 MiB 0.0 MiB 89 if weightedModel is not None:
106 112.7 MiB 0.0 MiB 88 self.watchlist.append(weightedModel)
107 112.7 MiB 0.0 MiB 88 self.averageTimes[symbol] = avgtime
108
109 112.7 MiB 0.0 MiB 1 if verbose:
110 print('\nFull Evaluation Results')
111 print(self.getWatchlistTableString())
112
113 112.7 MiB 0.0 MiB 1 self.watchlist.sort(key=WeightedModel.sortKey, reverse=True)
114
115 112.7 MiB 0.0 MiB 1 if filterConfidence:
116 112.7 MiB 0.0 MiB 91 self.watchlist = [ m for m in self.watchlist if m.getConfidence() >= self.config.winRate ]
117
118 112.7 MiB 0.0 MiB 1 if limitToMaxPositions:
119 self.watchlist = self.watchlist[:self.config.maxPositions]
120
121 112.7 MiB 0.0 MiB 1 return
This is from the first iteration of the evaluate function (there are 30 iterations total). Line 104 is where it seems to be accumulating memory. What's strange is that the weightedModel contains only basic stats about the data queried, and that data is stored in a loop-local variable. I can't figure out why the memory used is not cleaned up after each inner iteration.
I've tried to del the objects in question after an iteration completes, but it has no effect. The refcount does seem high for the containing objects, and gc.get_referrers shows an object as referring to itself (?).
I'm happy to provide additional information/code, but I've tried so many things at this point a braindump would be a complete mess :) I'm hoping someone with more experience might be able to help me focus my thought process.
Cheers!
Found it! The leak was one layer deeper, where the extension function builds an instance of a Python object.
This was the leaky version:
PyObject* obj = PyObject_CallObject(PRICEBAR_CLASS_DEF, args);
PyObject_SetAttrString(obj, "id", PyLong_FromLong(bar->id));
// ... a bunch of other attrs ...
return obj;
This is the fixed version:
PyObject* obj = PyObject_CallObject(PRICEBAR_CLASS_DEF, args);
PyObject* id = PyLong_FromLong(bar->id);
// ... others ...
PyObject_SetAttrString(obj, "id", id);
// ... others ...
Py_DECREF(id);
// ... others ...
return obj;
For some reason I had it in my head that PyLong_FromLong did NOT hand the caller a new reference, but it does: the caller owns the returned object, and PyObject_SetAttrString does not steal that reference, so it has to be Py_DECREF'ed. That is how I wound up with an extra reference (and therefore a leak) for every bar object that was created.
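For anyone verifying a fix like this from the Python side, here is a rough sketch of a regression check, assuming the module is named extension and exposes getData(date) as in the pseudocode above; the date value and loop count are placeholders. Call the function in a tight loop and confirm the process high-water mark stops growing.
import gc
import resource

import extension  # hypothetical module name taken from the pseudocode above

def max_rss_kb():
    # ru_maxrss is reported in kilobytes on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = max_rss_kb()
for _ in range(10000):
    data = extension.getData(20200102)  # placeholder date argument
    del data
gc.collect()
print("max RSS grew by", max_rss_kb() - before, "KB")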

Memory usage of List vs Generator seems almost same. why?

I have the following Python code. I am trying to understand Python generators. If my understanding is correct, print_list should take much more memory than print_generator. I am using memory_profiler to profile the two functions below.
from memory_profiler import profile
import logging

my_list = [i for i in range(100000)]
my_generator = (i for i in range(1000000))

@profile
def print_generator():
    try:
        while True:
            item = next(my_generator)
            logging.info(item)
    except StopIteration:
        pass
    finally:
        print('Printed all elements')

@profile
def print_list():
    for item in my_list:
        logging.info(item)
        pass

logging.basicConfig(filename='app.log', filemode='w', format='%(name)s - %(levelname)s - %(message)s')
print_list()
print_generator()
The result of the profiling is pasted below.
Memory usage for the generator.
Line # Mem usage Increment Occurences Line Contents
============================================================
10 23.0 MiB 23.0 MiB 1 @profile
11 def print_generator():
12 23.0 MiB 0.0 MiB 1 try:
13 23.0 MiB 0.0 MiB 1 while True:
14 23.0 MiB -26026.5 MiB 1000001 item = next(my_generator)
15 23.0 MiB -26026.5 MiB 1000000 logging.info(item)
16 23.0 MiB -0.1 MiB 1 except StopIteration:
17 23.0 MiB 0.0 MiB 1 pass
18 finally:
19 23.0 MiB 0.0 MiB 1 print('Printed all elements')
Memory usage for the list
Line # Mem usage Increment Occurences Line Contents
============================================================
22 23.0 MiB 23.0 MiB 1 @profile
23 def print_list():
24 23.0 MiB 0.0 MiB 100001 for item in my_list:
25 23.0 MiB 0.0 MiB 100000 logging.info(item)
26 23.0 MiB 0.0 MiB 100000 pass
The memory usage for the list and the generator seems almost identical.
So what am I missing here? Why isn't the generator using less memory?
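One thing the profile above cannot show: both my_list and my_generator are created at module level, before either profiled function runs, so the 23 MiB baseline already includes the fully built list, and neither function allocates much inside its own body. A quick sketch of where the saving actually lives (sizes are approximate): the list allocates storage for all of its element references up front, while the generator object stays tiny no matter how many items it will yield.
import sys

my_list = [i for i in range(1000000)]
my_generator = (i for i in range(1000000))

print(sys.getsizeof(my_list))       # roughly 8 MB of pointer storage for 1,000,000 items
print(sys.getsizeof(my_generator))  # a couple of hundred bytes, regardless of range size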

Too much memory used when dealing with BeautifulSoup4 in Python 3

I wrote a script which fetches the HTML content of a page and analyzes it.
Running this code in a loop over several URLs, I noticed the memory usage growing too much and too quickly.
Profiling and debugging the code with several tools, I believe the problem comes from the bit of code that uses BeautifulSoup4.
Line Mem usage Increment Line Contents
59 40.5 MiB 40.5 MiB @profile
60 def crawl(self):
70 40.6 MiB 0.0 MiB self.add_url_to_crawl(self.base_url)
71
72 291.8 MiB 0.0 MiB for url in self.page_queue:
74 267.4 MiB 0.0 MiB if url in self.crawled_urls:
75 continue
76
77 267.4 MiB 0.0 MiB page = Page(url=url, base_domain=self.base_url, log=self.log)
78
79 267.4 MiB 0.0 MiB if page.parsed_url.netloc != page.base_domain.netloc:
80 continue
81
82 291.8 MiB 40.1 MiB page.analyze()
83
84 291.8 MiB 0.0 MiB self.content_hashes[page.content_hash].add(page.url)
94
95 # Add crawled page links into the queue
96 291.8 MiB 0.0 MiB for url in page.links:
97 291.8 MiB 0.0 MiB self.add_url_to_crawl(url)
98
100 291.8 MiB 0.0 MiB self.crawled_pages.append(page.getData())
101 291.8 MiB 0.0 MiB self.crawled_urls.add(page.url)
102
103 291.8 MiB 0.0 MiB mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
104 291.8 MiB 0.0 MiB print('Memory usage is: {0} KB'.format(mem))
Here is what line 104 is printing each run:
Memory usage is: 69216 KB
Memory usage is: 92092 KB
Memory usage is: 105796 KB
Memory usage is: 134704 KB
Memory usage is: 158604 KB
Memory usage is: 184068 KB
Memory usage is: 225324 KB
Memory usage is: 248708 KB
Memory usage is: 273780 KB
Memory usage is: 298768 KB
Using tracemalloc in the main file, which calls all the modules and runs the crawl method above, I got the following list from tracemalloc.take_snapshot():
/usr/lib/python3.8/site-packages/bs4/element.py:744: size=23.3 MiB, count=210391, average=116 B
/usr/lib/python3.8/site-packages/bs4/builder/__init__.py:215: size=17.3 MiB, count=335036, average=54 B
/usr/lib/python3.8/site-packages/bs4/element.py:628: size=9028 KiB, count=132476, average=70 B
/usr/lib/python3.8/html/parser.py:327: size=7804 KiB, count=147140, average=54 B
/usr/lib/python3.8/site-packages/bs4/element.py:121: size=6727 KiB, count=132476, average=52 B
/usr/lib/python3.8/site-packages/bs4/element.py:117: size=6702 KiB, count=40848, average=168 B
/usr/lib/python3.8/html/parser.py:324: size=6285 KiB, count=85986, average=75 B
/usr/lib/python3.8/site-packages/bs4/element.py:772: size=5754 KiB, count=105215, average=56 B
/usr/lib/python3.8/html/parser.py:314: size=5334 KiB, count=105196, average=52 B
/usr/lib/python3.8/site-packages/bs4/__init__.py:587: size=4932 KiB, count=105197, average=48 B
Most of the files listed above are under the /bs4/ folder. Now, given that the page variable (line 82) is not stored anywhere, page.getData() returns a dictionary, and page.url is a string, why is BeautifulSoup taking up so much space in memory?
On line 72 you can see how the memory usage went from ~40 MB to ~291 MB while the loop processed 10 URLs; that is a big jump considering the data I'm actually saving is a small dictionary and a string.
Am I running into a garbage-collector problem, or did I write something wrong?
I'm not very experienced with Python, so I hope the conclusions I've drawn from the profiling and debugging are correct.
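Not a confirmed diagnosis, but one mitigation worth sketching inside Page.analyze() (the method name comes from the question; the body below is hypothetical): parse only what is needed with a SoupStrainer and explicitly decompose() the tree once the data has been extracted, so the heavily cross-referenced Tag objects don't survive between loop iterations.
from bs4 import BeautifulSoup, SoupStrainer

def analyze_links(html):
    # Restrict parsing to <a> tags if links are all that is needed (hypothetical example)
    only_links = SoupStrainer("a")
    soup = BeautifulSoup(html, "html.parser", parse_only=only_links)
    links = [a.get("href") for a in soup.find_all("a")]
    soup.decompose()  # break parent/child references so the tree can be garbage-collected
    return links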

How to find the average of every six cells in Excel

This is a relatively common question, so I don't want to be voted down for asking something that has been asked before. I will explain the steps I took to answer it using Stack Overflow and other sources, so you can see that I have made attempts to solve it myself.
I have a set of values as below:
O      P    Q      "R"    Z
6307   586  92.07   1.34
3578   195  94.83   6.00
3147   234  93.08   4.29
3852   227  94.43  15.00
3843   171  95.74   5.10
3511   179  95.15   7.18
6446   648  90.87   1.44
4501   414  91.58   0.38
3435   212  94.19   6.23
I want to take the average of the first six values in column "R" and then put that average in column Z on the sixth row, like this:
O      P    Q      "R"    Z
6307   586  92.07   1.34
3578   195  94.83   6.00
3147   234  93.08   4.29
3852   227  94.43  15.00
3843   171  95.74   5.10
3511   179  95.15   7.18   6.49
6446   648  90.87   1.44
4501   414  91.58   0.38
3435   212  94.19   6.23
414    414  91.58   3.49
212    212  94.19  11.78
231    231  93.44  -1.59   3.6
191    191  94.59   2.68
176    176  91.45   .75
707    707  91.96   2.68
792    420  90.95   0.75
598    598  92.15   7.45
763    763  90.66  -4.02
652    652  91.01   3.75
858    445  58.43   2.30   2.30
I used the following formula that I found:
=AVERAGE(OFFSET(R1510,COUNTA(R:R)-6,0,6,1))
but I received an answer that was different from what I obtained by simply taking the average of the six previous cells, as in:
=AVERAGE(R1505:R1510)
I then tried the following code from a Stack Overflow conversation (excel averaging every 10 rows) that was tangentially similar to what I wanted
=AVERAGE(INDEX(R:R,1+6*(ROW()-ROW($B$1))):INDEX(R:R,10*(ROW()- ROW($B$1)+1)))
but I was unable to get an answer that resembled a plain
=AVERAGE(R1517:R1522)
I also found another approach, below, but was unable to correctly adapt the cell references (F3 to R1510, for example):
=AVERAGE(OFFSET(F3,COUNTA($R$1510:$R$1517)-1,,-6,))
Doing so gave me a negative number for a clearly positive set of data. It was -6.95.
Put this in Z1 and copy down:
=IF(MOD(ROW(),6)=0,AVERAGE(INDEX(R:R,ROW()-5):INDEX(R:R,ROW())),"")
MOD(ROW(),6)=0 is true only on every sixth row; on those rows the INDEX pair averages the current row together with the five rows above it, and every other row gets an empty string.

CPU Higher than expected in Node running in docker

I have a Vagrant machine running at 33% CPU on my Mac (10.9.5) when nothing is supposed to be happening. The VM is run by Kitematic. Looking inside one of the containers I see 2 node (v0.12.2) processes running at 3-4% CPU each.
root@49ab3ab54901:/usr/src# top -bc
top - 03:11:59 up 8:31, 0 users, load average: 0.13, 0.18, 0.22
Tasks: 7 total, 1 running, 6 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.7 sy, 0.0 ni, 99.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 2051824 total, 1942836 used, 108988 free, 74572 buffers
KiB Swap: 1466848 total, 18924 used, 1447924 free. 326644 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 4332 672 656 S 0.0 0.0 0:00.10 /bin/sh -c node -e "require('./seed/seeder.js').seed().then(function (resp) { console.log('successfully seeded!'); pro+
15 root 20 0 737320 81008 13452 S 0.0 3.9 0:32.57 node /usr/local/bin/nodemon app/api.js
33 root 20 0 4332 740 652 S 0.0 0.0 0:00.00 sh -c node app/api.js
34 root 20 0 865080 68952 14244 S 0.0 3.4 0:01.70 node app/api.js
83 root 20 0 20272 3288 2776 S 0.0 0.2 0:00.11 bash
18563 root 20 0 20248 3152 2840 S 0.0 0.2 0:00.11 bash
18575 root 20 0 21808 2308 2040 R 0.0 0.1 0:00.00 top -bc
I went on and ran node --prof and processed the log with node-tick-processor. It looks like 99.3% of the CPU time is spent in syscall:
(for the full output see http://pastebin.com/6qgFuFWK )
root@d6d78487e1ec:/usr/src# node-tick-processor isolate-0x26c0180-v8.log
...
Statistical profiling result from isolate-0x26c0180-v8.log, (130664 ticks, 0 unaccounted, 0 excluded).
...
[C++]:
ticks total nonlib name
129736 99.3% 99.3% syscall
160 0.1% 0.1% node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
124 0.1% 0.1% __write
73 0.1% 0.1% __xstat
18 0.0% 0.0% v8::internal::Heap::AllocateFixedArray(int, v8::internal::PretenureFlag)
18 0.0% 0.0% node::Stat(v8::FunctionCallbackInfo<v8::Value> const&)
17 0.0% 0.0% __lxstat
16 0.0% 0.0% node::Read(v8::FunctionCallbackInfo<v8::Value> const&)
...
1 0.0% 0.0% __fxstat
1 0.0% 0.0% _IO_default_xsputn
[GC]:
ticks total nonlib name
22 0.0%
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
129736 99.3% syscall
[Top down (heavy) profile]:
Note: callees occupying less than 0.1% are not shown.
inclusive self name
ticks total ticks total
129736 99.3% 129736 99.3% syscall
865 0.7% 0 0.0% Function: ~<anonymous> node.js:27:10
864 0.7% 0 0.0% LazyCompile: ~startup node.js:30:19
851 0.7% 0 0.0% LazyCompile: ~Module.runMain module.js:499:26
799 0.6% 0 0.0% LazyCompile: Module._load module.js:273:24
795 0.6% 0 0.0% LazyCompile: ~Module.load module.js:345:33
794 0.6% 0 0.0% LazyCompile: ~Module._extensions..js module.js:476:37
792 0.6% 0 0.0% LazyCompile: ~Module._compile module.js:378:37
791 0.6% 0 0.0% Function: ~<anonymous> /usr/src/app/api.js:1:11
791 0.6% 0 0.0% LazyCompile: ~require module.js:383:19
791 0.6% 0 0.0% LazyCompile: ~Module.require module.js:362:36
791 0.6% 0 0.0% LazyCompile: Module._load module.js:273:24
788 0.6% 0 0.0% LazyCompile: ~Module.load module.js:345:33
786 0.6% 0 0.0% LazyCompile: ~Module._extensions..js module.js:476:37
783 0.6% 0 0.0% LazyCompile: ~Module._compile module.js:378:37
644 0.5% 0 0.0% Function: ~<anonymous> /usr/src/app/api.authentication.js:1:11
627 0.5% 0 0.0%
...
Running strace showed nothing abnormal:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
54.51 0.001681 76 22 clone
17.28 0.000533 4 132 epoll_ctl
16.80 0.000518 24 22 wait4
6.39 0.000197 2 110 66 stat
5.03 0.000155 1 176 close
0.00 0.000000 0 176 read
0.00 0.000000 0 88 write
0.00 0.000000 0 44 rt_sigaction
0.00 0.000000 0 88 rt_sigprocmask
0.00 0.000000 0 22 rt_sigreturn
0.00 0.000000 0 66 ioctl
0.00 0.000000 0 66 socketpair
0.00 0.000000 0 88 epoll_wait
0.00 0.000000 0 22 pipe2
------ ----------- ----------- --------- --------- ----------------
100.00 0.003084 1122 66 total
And the other node process:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
0.00 0.000000 0 14 epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 14 total
Am I missing something?
I wonder if it is VirtualBox's or Docker's layers that are consuming the 4%.
When you have a few containers, each with 2 processes running at 4%, it adds up quickly.
