What’s up with OpenStack Swift metadata

The other day I got curious about which attributes OpenStack Swift actually stores alongside an object's data.

First I had to find out where on disk Swift stores the object. In my case the ring was built with a replica count of 2, so the object's partition exists on exactly two devices.

The easy way to look up this information is the swift-get-nodes tool:

swift-get-nodes <ring file> <URL containing account+container+path>
# swift-get-nodes /etc/swift/object-1.ring.gz /AUTH_e1496568b6864cb1b52cdfe7436c213f/test/root/hummingbird |grep lah

ssh 172.29.244.100 "ls -lah ${DEVICE:-/srv/node*}/swift4.img/objects-1/39/83a/27ea485e7f147e5e47f9c38dd0feb83a"

ssh 172.29.244.100 "ls -lah ${DEVICE:-/srv/node*}/swift5.img/objects-1/39/83a/27ea485e7f147e5e47f9c38dd0feb83a"
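
The directory names in that output follow a fixed scheme: partition, then the last three hex characters of the object hash, then the full hash. Below is a minimal Python sketch of that layout. Treat it as an illustration, not Swift's own code. The partition power of 8 is an assumption that happens to be consistent with partition 39 above (the ring's real value is not shown here), the device root /srv/node/swift4.img is a guess at what the glob expands to, and the prefix and suffix arguments stand in for the cluster-wide hash salt configured in swift.conf.

```python
import struct
from hashlib import md5


def hash_path(account, container, obj, prefix=b"", suffix=b""):
    """Hex MD5 of '/account/container/object', salted with prefix/suffix.

    prefix and suffix are placeholders for the per-cluster salt; with the
    wrong values the hash will not match what is on disk.
    """
    path = "/" + "/".join((account, container, obj))
    return md5(prefix + path.encode("utf-8") + suffix).hexdigest()


def partition_for(hex_hash, part_power):
    """Partition = the top `part_power` bits of the hash's first 4 bytes."""
    top32 = struct.unpack_from(">I", bytes.fromhex(hex_hash))[0]
    return top32 >> (32 - part_power)


def object_dir(device_root, hex_hash, part_power, datadir="objects"):
    """Build <device>/<datadir>/<partition>/<hash suffix>/<hash>."""
    part = partition_for(hex_hash, part_power)
    return f"{device_root}/{datadir}/{part}/{hex_hash[-3:]}/{hex_hash}"


# Hash taken from the swift-get-nodes output; part_power=8 is an assumption.
h = "27ea485e7f147e5e47f9c38dd0feb83a"
print(partition_for(h, 8))  # 39
print(object_dir("/srv/node/swift4.img", h, 8, datadir="objects-1"))
```

The first four bytes of the hash are 0x27ea485e, and shifting them right by 24 bits leaves 0x27, which is 39, matching the directory shown by swift-get-nodes.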

Let's change into the object's directory and retrieve all xattr keys and values:

# cd /srv/swift4.img/objects-1/39/83a/27ea485e7f147e5e47f9c38dd0feb83a/
# getfattr -dm '.*' 1469228674.06028.data

# file: 1469228674.06028.data
 user.swift.metadata="�}q(UC⎺┼├e┼├↑Le┼±├▒─12321▮41U┼▒└e─U>>"

Searching the Swift source for the key name user.swift.metadata shows that the value is a pickled Python object: https://github.com/openstack/swift/blob/master/swift/obj/diskfile.py#L133

Now let’s uncover the data of the pickle object:

Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.


>>> import xattr
>>> import pickle

>>> fh = open('1469228674.06028.data')

>>> xattr.listxattr(fh)
 (u'user.swift.metadata',)

>>> xattr.getxattr(fh,'user.swift.metadata')
 '\x80\x02}q\x01(U\x0eContent-Lengthq\x02U\x0812321041U\x04nameq\x03U>> 

>>> p = pickle.loads(xattr.getxattr(fh,'user.swift.metadata'))

>>> print p
 {'Content-Length': '12321041', 'name': '/AUTH_e1496568b6864cb1b52cdfe7436c213f/test/root/hummingbird', 'Content-Type': 'application/octet-stream', 'ETag': 'd645ab07a4a452abeeb7f3ad0ec0f7db', 'X-Timestamp': '1469228674.06028', 'X-Object-Meta-Mtime': '1466738039.627804'}
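
The session above uses Python 2 and the third-party xattr module. For anyone repeating this on Python 3, here is a rough equivalent that relies only on the standard library (os.getxattr is Linux-only). The helper names read_raw_metadata and decode_metadata are my own. The loop over numbered keys reflects my understanding that Swift splits large metadata across user.swift.metadata, user.swift.metadata1 and so on, so take that part as an assumption. Only unpickle data from files you trust, since pickle can execute arbitrary code.

```python
import os
import pickle

METADATA_KEY = "user.swift.metadata"


def read_raw_metadata(path):
    """Concatenate user.swift.metadata, user.swift.metadata1, ... for a file."""
    chunks = []
    index = 0
    while True:
        key = METADATA_KEY + (str(index) if index else "")
        try:
            chunks.append(os.getxattr(path, key))
        except OSError:
            break  # no further chunk, or no xattr support on this filesystem
        index += 1
    return b"".join(chunks)


def decode_metadata(raw):
    """Unpickle the metadata blob into a dict.

    encoding="latin1" lets Python 3 load byte strings that were pickled
    by Python 2, which is what the transcript above shows.
    """
    return pickle.loads(raw, encoding="latin1")


# In-memory demo with a tiny Python 2 style pickle (no xattr access needed).
sample = b"\x80\x02}q\x01U\x04nameq\x02U\x06/a/c/oq\x03s."
print(decode_metadata(sample))

# On a storage node the call would look like:
#   decode_metadata(read_raw_metadata("1469228674.06028.data"))
```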

And there it is: the usual suspects are all stored. This metadata is what Swift returns with each stat request. Quite clever, because that way Swift does not need to rehash the file or read additional attributes for every object it serves.