Pro-tip: You need to disable Google Advanced Protection, or your Google Takeout request will be stuck in “Scheduled” indefinitely.


Every 6-12 months I do a Google Takeout export of my Google Photos, and either I’ve never noticed this before or this is a new issue.

I told Takeout to put the files into Google Drive, and in total there were 45 .tgz files, totaling 2.2TB.

Extracting them all as I always do (pv takeout-*.tgz | tar xzif -) results in 1.1TB of files (yes, suspiciously, half).
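
My first thought was truncated downloads, so before blaming Takeout itself it’s worth verifying each archive’s gzip stream; a quick sketch, assuming the archives sit in the current directory:

# A truncated or corrupted download fails the integrity test here,
# before tar ever sees it.
for f in takeout-*.tgz; do
  gzip -t "$f" && echo "OK: $f" || echo "BAD: $f"
done

Note that gzip -t only checks the compressed stream, so it catches truncation and bit rot but says nothing about what Takeout actually put inside.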

What’s weird is that when Google Takeout put all of the tgz’s into Google Drive, there were a number of duplicates (e.g. 2x takeout-20250610T130000Z-1-013.tgz). So I renamed them so they were unique (e.g. to takeout-20250610T130000Z-1-013-0.tgz and takeout-20250610T130000Z-1-013-1.tgz), downloaded them all, and extracted them, and got 1.1TB.

However, the weird thing is that the “duplicates” aren’t the same size:

-rwxrwxr-x 1 nobody nogroup 50700M Jun 17 15:46 takeout-20250610T130000Z-1-011-0.tgz
-rwxrwxr-x 1 nobody nogroup 50760M Jun 17 15:42 takeout-20250610T130000Z-1-011-1.tgz
-rwxrwxr-x 1 nobody nogroup 50724M Jun 17 15:46 takeout-20250610T130000Z-1-012-0.tgz
-rwxrwxr-x 1 nobody nogroup 50771M Jun 17 15:43 takeout-20250610T130000Z-1-012-1.tgz
-rwxrwxr-x 1 nobody nogroup 50981M Jun 17 15:46 takeout-20250610T130000Z-1-013-0.tgz
-rwxrwxr-x 1 nobody nogroup 50589M Jun 17 15:43 takeout-20250610T130000Z-1-013-1.tgz

and obviously the MD5s are different.
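
Before reading anything into the checksums, a useful question is whether a “duplicate” pair at least lists the same paths; a sketch against the renamed -013 pair from above:

# If the two archives hold the same file list, this prints nothing;
# any output is paths present in one archive but not the other.
diff <(tar -tzf takeout-20250610T130000Z-1-013-0.tgz | sort) \
     <(tar -tzf takeout-20250610T130000Z-1-013-1.tgz | sort)

Identical listings with different bytes could just be different member ordering or compression; different listings would mean the two copies genuinely contain different files.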

And even stranger, I ran:

for file in $(ls takeout-*.tgz | sort -V); do echo "$file"; tar -tzf "$file" | grep 'Takeout/Google Photos/'; done > files.txt

And the number of files listed is 269965, while the number of files that actually get extracted is 134970 (again, roughly half).
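
That near-perfect halving smells like overlap: each file listed in two archives, with the second extraction overwriting the first. A sketch to test that, assuming files.txt came from the loop above and the extracted tree is in the current directory:

# Non-directory entries listed across all archives (tar listings
# include directory entries with a trailing slash, so drop those;
# the echoed archive names don't start with Takeout/ either).
grep '^Takeout/Google Photos/' files.txt | grep -v '/$' | wc -l

# Unique paths: a result near 134970 would mean every file appears
# in two archives and extraction just overwrites the first copy.
grep '^Takeout/Google Photos/' files.txt | grep -v '/$' | sort -u | wc -l

# Cross-check against what actually landed on disk.
find 'Takeout/Google Photos' -type f | wc -l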

It’s a Google Workspace account, so I can’t see in the Google Photos app how much storage I’m using (because of course that makes sense), and in the Google Admin portal, it says my user is using ~830GB, which matches neither the 2.2TB of archives nor the 1.1TB extracted.