@bosswissam
Last active July 18, 2022 03:04
import sys


def get_size(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important: mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum([get_size(v, seen) for v in obj.values()])
        size += sum([get_size(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size(i, seen) for i in obj])
    return size
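
As a quick illustration of why the recursion and the `seen` set matter, the sketch below compares `sys.getsizeof` (shallow) with `get_size` (deep) and checks that a self-referential list terminates. The function is repeated so the snippet runs on its own; the sample data (`nested`, `cyclic`, `shared`) is made up for the example, and exact byte counts vary by Python version and platform, so none are shown.

```python
import sys


def get_size(obj, seen=None):
    """Recursively finds size of objects (same logic as the gist above)."""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Mark as seen before recursing so cycles and shared objects count once.
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum(get_size(v, seen) for v in obj.values())
        size += sum(get_size(k, seen) for k in obj.keys())
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size(i, seen) for i in obj)
    return size


# Hypothetical sample data for illustration only.
nested = {"numbers": [1, 2, 3], "text": "x" * 1000}
shallow = sys.getsizeof(nested)  # the dict's own storage only
deep = get_size(nested)          # dict plus keys and values, recursively
print("shallow:", shallow, "deep:", deep)

# A list that contains itself: the seen set stops infinite recursion.
cyclic = []
cyclic.append(cyclic)
print("cyclic handled:", get_size(cyclic) == sys.getsizeof(cyclic))

# The same string referenced twice is only counted once.
payload = "y" * 1000
shared = [payload, payload]
print("shared counted once:",
      get_size(shared) == sys.getsizeof(shared) + sys.getsizeof(payload))
```

Counting each object once is a design choice: it reports the memory actually held, not the sum of what each reference appears to point at.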
@harishvc

Amazing! 👍

aa = [[],['a'],['a','b'],[1], [123], [123,456]]
for a in aa:
    print("%s size=%d" %(a, get_size(a)))

#output
[] size=64
['a'] size=122
['a', 'b'] size=180
[1] size=100
[123] size=100
[123, 456] size=136

@mavillan

Thanks!

ghost commented Sep 13, 2017

This is interesting, but one thing confuses me and I hope you can help. To get the size of everything an object holds, I multiply the object's length by the size of a single element, i.e. sys.getsizeof(1) * len(object). That solves my simple use case, but I don't understand in practice which sub-attributes of an object this simple approach fails to account for.
Sample script:

import sys
simple_list = list(range(3))
print("size of list", sys.getsizeof(1) * len(simple_list))
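
To see where the multiplication shortcut breaks down, here is a small sketch with made-up data (the `items` list is hypothetical). `sys.getsizeof(1) * len(obj)` assumes every element is the size of a small int, and it ignores both the container's own overhead and anything nested inside the elements. Exact byte counts depend on the Python build, so none are shown.

```python
import sys

# Hypothetical mixed data: elements differ in size and one is itself a container.
items = ["a", "a" * 1000, [1, 2, 3]]

# The shortcut: pretend every element is as large as the int 1.
estimate = sys.getsizeof(1) * len(items)

# What the elements and the list object report individually (shallow sizes).
element_sizes = [sys.getsizeof(x) for x in items]
container_size = sys.getsizeof(items)

print("estimate:", estimate)
print("per-element shallow sizes:", element_sizes)
print("container overhead:", container_size)
print("shallow total:", container_size + sum(element_sizes))
```

Even the shallow total still misses the three ints inside the nested list, which is the part a recursive function such as get_size accounts for.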

@aliartiza75

aliartiza75 commented May 2, 2018

Amazing code ! thanks

@AnshulBasia

AnshulBasia commented Jun 8, 2018

@bosswissam

Is this not weird?

>>> from pysize import get_size
>>> get_size(b)
152
>>> get_size(b[0])
28
>>> get_size(b[1])
28
>>> len(b)
300

@josiahjohnston

josiahjohnston commented Oct 12, 2018

I've been happily using this code for a long time, but I just encountered a use case where this breaks down: a class built over a simple namedtuple data core. This pattern is desirable for certain multi-processing/cloud computing contexts.

from __future__ import print_function
from collections import namedtuple
import sys
import numpy as np

my_tup = namedtuple('MyNamedTuple', ['Array','Name'])
class my_class(my_tup):
    def __init__(self, *kwargs):
        super(my_class, self).__init__(*kwargs)
    # Add workhorse functions...

dat_tuple = my_tup(np.zeros([1000,1000]), 'long name'*10)
dat_obj = my_class(np.zeros([1000,1000]), 'long name'*10)
print(get_size(dat_tuple), get_size(dat_obj))

These sizes should be almost the same, but they are not.

8000946 360

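The problem arises because dat_obj has an (empty) __dict__, so get_size takes the hasattr(obj, '__dict__') branch and never reaches the __iter__ branch, which is the only way to visit the data stored in the tuple's elements.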

Here is the fix I made. It doesn't come out exactly the same, but it's a lot closer than before:

def get_size2(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important: mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    # Recurse into get_size2 (not get_size) so nested objects get the same fix
    if isinstance(obj, dict):
        size += sum([get_size2(v, seen) for v in obj.values()])
        size += sum([get_size2(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        # Check __iter__ before __dict__ so iterable elements are always visited
        size += sum([get_size2(i, seen) for i in obj])
        if hasattr(obj, '__dict__'):
            size += get_size2(obj.__dict__.values(), seen)
    elif hasattr(obj, '__dict__'):
        size += get_size2(obj.__dict__, seen)
    return size

print(get_size2(dat_tuple), get_size2(dat_obj))

8000671 8000647
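
The same effect can be reproduced without NumPy. The sketch below is a standard-library-only illustration, not the original test case: `Record` and `RecordWithMethods` are hypothetical names, and a large `bytes` object stands in for the array. Both functions are repeated so the snippet is self-contained, with the recursive calls in `get_size2` going to `get_size2` itself. Byte counts vary by Python version, so none are shown.

```python
import sys
from collections import namedtuple


def get_size(obj, seen=None):
    """Original ordering: __dict__ is checked before __iter__."""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    if isinstance(obj, dict):
        size += sum(get_size(v, seen) for v in obj.values())
        size += sum(get_size(k, seen) for k in obj.keys())
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size(i, seen) for i in obj)
    return size


def get_size2(obj, seen=None):
    """Adjusted ordering: __iter__ is checked before __dict__."""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    if isinstance(obj, dict):
        size += sum(get_size2(v, seen) for v in obj.values())
        size += sum(get_size2(k, seen) for k in obj.keys())
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size2(i, seen) for i in obj)
        if hasattr(obj, '__dict__'):
            size += get_size2(obj.__dict__.values(), seen)
    elif hasattr(obj, '__dict__'):
        size += get_size2(obj.__dict__, seen)
    return size


# Hypothetical stand-ins for my_tup / my_class.
Record = namedtuple("Record", ["payload", "name"])


class RecordWithMethods(Record):
    """Subclass without __slots__, so instances gain an (empty) __dict__."""


plain = Record(bytes(1_000_000), "long name" * 10)
wrapped = RecordWithMethods(bytes(1_000_000), "long name" * 10)

print("get_size: ", get_size(plain), get_size(wrapped))    # wrapped is undercounted
print("get_size2:", get_size2(plain), get_size2(wrapped))  # both include the payload
```

Declaring `__slots__ = ()` on the subclass would be another way to avoid the empty `__dict__`, at the cost of not being able to add instance attributes.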

@liran-funaro

I implemented a truly generic solution: objsize.
It uses the gc module instead of trying to guess an object's children, and it avoids recursive calls.
You can simply install it via pip if you want: pip install objsize.
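
For readers curious how a gc-based, non-recursive traversal can work in principle, here is a minimal sketch using only the standard library. It is not objsize's actual implementation: the function name `deep_size_gc` and the choice of which types to skip are assumptions made for illustration.

```python
import gc
import sys
from types import FunctionType, ModuleType

# Assumption for this sketch: classes, modules and functions are shared
# infrastructure, so they are not counted as part of an object's size.
SKIPPED_TYPES = (type, ModuleType, FunctionType)


def deep_size_gc(obj):
    """Sum sys.getsizeof over everything reachable from obj, iteratively."""
    seen = set()
    stack = [obj]  # explicit stack instead of recursive calls
    total = 0
    while stack:
        current = stack.pop()
        if isinstance(current, SKIPPED_TYPES) or id(current) in seen:
            continue
        seen.add(id(current))
        total += sys.getsizeof(current)
        # Ask the garbage collector which objects this one refers to,
        # rather than guessing via __dict__ or __iter__.
        stack.extend(gc.get_referents(current))
    return total


# Hypothetical sample data for illustration only.
data = {"a": [1, 2, 3], "b": "x" * 1000}
print(deep_size_gc(data))

# Self-referential structures terminate because of the seen set.
loop = []
loop.append(loop)
print(deep_size_gc(loop) == sys.getsizeof(loop))
```

Using an explicit stack means deeply nested structures cannot hit Python's recursion limit, which is one practical reason to prefer this shape over a recursive function.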

@DifferentialPupil

DifferentialPupil commented Jul 21, 2020

There is a Python module called Pympler that provides similar functionality, plus other features such as tracking the memory consumption of instances of a specific class.
https://pympler.readthedocs.io/en/latest/
