If an Author has an updated_date of 2021-12-30 it will be prefixed /data/authors/updated_date=2021-12-30/.

If you are making a first copy of the data, the updated_date partitions aren't important yet. You need all the entities, so for Authors you would get /data/authors/*/*.gz.

There can be multiple objects under each updated_date partition. Each is under 2GB.

Each entity type has a manifest file; for example, /data/works/manifest lists all the works.

The updated_date partitions make it easy to keep a copy up to date, but the way they work may be unfamiliar. Unlike a set of dated snapshots that each contain the full dataset as of a certain date, each partition contains the records that last changed on that date.

Suppose the dataset starts out on 2021-12-30 with a set of Authors, each being newly created on that date. /data/authors/ then contains only the updated_date=2021-12-30 partition.

If changes were then made to some of those Authors, they would come out of one of the files in /data/authors/updated_date=2021-12-30 and go into one in /data/authors/updated_date=2022-01-04.

So if you made your copy on 2021-12-30, you would only need to download /data/authors/updated_date=2022-01-04 to get everything that was changed or added since then.

To update a copy that you created or updated on date X, insert or update the records in objects where updated_date > X.

These are the Author partitions and the number of records in each (in the actual dataset):

updated_date=2021-12-30/ - 62,573,099
updated_date=2022-12-31/ - 97,559,192
updated_date=2022-01-01/ - 46,766,699
updated_date=2022-01-02/ - 1,352,773

The manifest file

When we start writing a new updated_date partition for an entity, we'll delete that entity's manifest file. When we finish writing the partition, we'll recreate the manifest, including the newly-created objects. So if the manifest is there, all the entities are there too.

To use the manifest as part of the update process (using Authors as an example):

1. Download s3://openalex/data/authors/manifest.
2. Get the file list from the url property of each item in the entries list.
3. Download any objects with an updated_date you haven't seen before.
4. Download s3://openalex/data/authors/manifest again. If it hasn't changed since (1), no records moved around and any date partitions you downloaded are valid.
5. Decompress the files you downloaded and read one Author per line. Insert or update into your database of choice, using each entity's ID as a primary key.
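
The manifest-driven update described above can be sketched in Python. This is a minimal illustration under stated assumptions, not an official client. It assumes the manifest parses as JSON with an entries list whose items carry a url property (as the text describes), that each object is a gzip-compressed file with one JSON record per line, and that each record exposes its ID under an "id" key. The file name part_000.gz, the function names, and the use of a plain dict in place of a database are all hypothetical. Downloading from S3 is left out so the sketch works on data already in memory.

```python
import gzip
import io
import json
import re
from datetime import date

# Matches the partition segment of an object URL, e.g. ".../updated_date=2022-01-04/...".
PARTITION_RE = re.compile(r"updated_date=(\d{4}-\d{2}-\d{2})/")


def partition_date(url: str) -> date:
    """Extract the updated_date partition from an object URL."""
    match = PARTITION_RE.search(url)
    if match is None:
        raise ValueError(f"no updated_date partition in {url!r}")
    return date.fromisoformat(match.group(1))


def urls_newer_than(manifest: dict, last_sync: date) -> list[str]:
    """Return object URLs whose partition date is strictly after last_sync."""
    urls = [entry["url"] for entry in manifest["entries"]]
    return [u for u in urls if partition_date(u) > last_sync]


def manifest_unchanged(before: dict, after: dict) -> bool:
    """True if the manifest fetched after downloading equals the first one."""
    return before == after


def iter_records(gz_bytes: bytes):
    """Yield one parsed record per non-empty line of a gzip-compressed object."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                yield json.loads(line)


def upsert(store: dict, records) -> None:
    """Insert or update records keyed by entity ID (dict as a database stand-in)."""
    for record in records:
        store[record["id"]] = record  # "id" key is an assumption


if __name__ == "__main__":
    # Hypothetical manifest content; real object names will differ.
    manifest = {"entries": [
        {"url": "s3://openalex/data/authors/updated_date=2021-12-30/part_000.gz"},
        {"url": "s3://openalex/data/authors/updated_date=2022-01-04/part_000.gz"},
    ]}
    for url in urls_newer_than(manifest, date(2021, 12, 30)):
        print(url)
```

Filtering on the partition date rather than on individual records follows the rule that a copy last updated on date X only needs objects where updated_date > X. Comparing the two manifest downloads is the cheap consistency check from step 4: if they differ, the partitions may have changed mid-download and the fetch should be repeated.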