The 1000 per request limit doesn't limit the number of objects in a bucket
or their names. It's solely to make sure the REST API doesn't get bogged
down trying to return 1M objects in a list at once. When a bucket has more
than 1000 objects, the response includes a marker; you just have to issue a
second call, starting from the marker. s3cmd does this automatically for
'ls' and most list operations. It's only when an object has more than 1000
parts that it doesn't issue the subsequent calls.
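The marker protocol described above can be sketched with a local simulation (no real S3 calls; `list_page` here is a stand-in for S3's list-objects response, not an s3cmd API):

```python
def list_page(keys, marker="", max_keys=1000):
    """Stand-in for one S3 list request: returns up to max_keys keys
    lexicographically after `marker`, plus a truncation flag."""
    after = [k for k in sorted(keys) if k > marker]
    return after[:max_keys], len(after) > max_keys


def list_all(keys, max_keys=1000):
    """What s3cmd does for 'ls': keep re-issuing the request, starting
    from the last key returned, until the response is no longer truncated."""
    result, marker, truncated = [], "", True
    while truncated:
        page, truncated = list_page(keys, marker, max_keys)
        result.extend(page)
        if page:
            marker = page[-1]
    return result
```

With 2500 keys and a 1000-key page size, `list_all` issues three "requests" and returns every key in order.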
Post by Russell Gadd
Thanks Matt. I suspected the files-from wouldn't work with ls.
But your mention of a limit of 1000 on list operations worries me, as my
plan was to put about 25000 files into one folder, each named just its
32-hex-character MD5, with no subfolder hierarchy. It would seem that I
could then not get a list of all the objects. I was totally unaware of this
arbitrary limit.
(I'm thinking aloud now)
One method would perhaps be to issue 256 requests, one for each
two-character prefix from 00 to ff. I have about 30000 files, so this
would give me an average of about 120 files per subset, although there
could be an outlier two-character prefix with more than 1000 files. In
that case would I get just 1000 responses? And I don't think there'd be an
obvious way to get the remaining files. However, if they came back in
alphabetical order, perhaps I'd only have to issue a few more requests to
get the files beginning, say, xyd to xyf, assuming I already had all xy0
to xyc. Sounds like an exercise in recursive algorithms. Actually, if done
right it will take much less than 256 requests.
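The recursive scheme above can be sketched as follows. This is a local simulation (`list_prefix` is a stand-in for a prefix-filtered list request, and the keys are assumed to be hex strings like MD5 names); note that the marker mechanism makes the whole scheme unnecessary in practice:

```python
HEX = "0123456789abcdef"


def list_prefix(keys, prefix, max_keys=1000):
    """Stand-in for one prefix-filtered list request; the flag signals
    that more than max_keys keys matched the prefix."""
    hits = sorted(k for k in keys if k.startswith(prefix))
    return hits[:max_keys], len(hits) > max_keys


def list_by_prefix(keys, prefix="", max_keys=1000):
    """The recursive 00..ff idea: if a prefix still overflows the
    per-request limit, split it one hex digit deeper (16 children).
    Assumes keys are hex strings, so recursion bottoms out."""
    page, truncated = list_prefix(keys, prefix, max_keys)
    if not truncated:
        return page
    out = []
    for c in HEX:
        out.extend(list_by_prefix(keys, prefix + c, max_keys))
    return out
```

Because HEX is in lexicographic order, the concatenated result comes back fully sorted, and only prefixes that actually overflow trigger extra requests.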
Maybe I need to rethink, but I won't be looking for a complete list very
often so perhaps it might be ok.
Russell
Post by Matt Domsch
[ls] doesn't honor the --files-from option. [ls] simply asks S3 for all
the files in a bucket, possibly recursively, starting from a given prefix.
Jeremy is correct that it doesn't matter whether a request returns 0 bytes
or a list of 1000 objects; it's counted as one request. Most operations
have a limit on the number of items they can operate on (e.g. list bucket
and multiple-object delete are each limited to 1000 objects per
operation/request). If, though, given a list of 1000 objects, we do a
metadata HEAD request for each object, then you'll have made 1001
requests. (We don't get metadata for every object anymore, though, only
when we need it.)
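As a back-of-envelope check of the counting rule above (my arithmetic, not anything s3cmd itself computes):

```python
import math


def list_requests(n_objects, per_page=1000):
    # one list request per page of up to 1000 keys;
    # even an empty bucket costs one request
    return max(1, math.ceil(n_objects / per_page))


def list_with_heads(n_objects, per_page=1000):
    # add one metadata HEAD per object: a 1000-key page becomes
    # 1 list request + 1000 HEADs = 1001 requests
    return list_requests(n_objects, per_page) + n_objects
```

So a 30000-object bucket lists in 30 requests, but fetching metadata for every object would push that to 30030.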
Post by Russell Gadd
1. Can you tell me if --files-from is an available option for the ls
command? I've experimented to find out but without success. (Example: s3cmd
-r --files-from=testlist.txt ls s3://xyztestbucket). Probably not but I
just wanted to check. It's not clear in the documentation although I
suspect most people probably haven't got a use for it. So please confirm
that --files-from doesn't apply to ls or else tell me how to specify the
command and the list of files (i.e. is s3://bucket-name required at the
front of each file). In my proposed usage it would be useful as it would
verify the existence of specific files. If not available I will have to
issue one command per file unless I list the whole lot since I'm not using
folders.
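Since ls won't take --files-from, one workaround (my sketch, not an s3cmd feature) is to fetch the full bucket listing once and diff it locally against the names in the list file:

```python
def missing_keys(files_from, listed):
    """files_from: the object names you expect (one per line in your
    list file); listed: the keys returned by a full bucket listing
    (e.g. parsed from 's3cmd ls' output). Returns the expected names
    that are absent from the bucket, in their original order."""
    present = set(listed)
    return [k for k in files_from if k not in present]
```

With ~30000 flat-named objects that's about 30 list requests total, versus one HEAD request per file checked.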
2. I'm not sure of the meaning of "requests" in the pricing of get or
list requests, which for EU-West is $.004 / 1000 requests.
Does this mean $.004 for a single request which returns 1000 file names,
or $.004 per 1000 list requests, each of which could return any number of
filenames?
Actually it's probably small beer for my usage but it would be nice to know.
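Taking the per-request reading (each list call is one billable request, regardless of how many names it returns), the cost works out as:

```python
import math


def full_listing_cost(n_objects, price_per_1000_requests=0.004):
    # each list request returns up to 1000 keys and bills as one request
    requests = max(1, math.ceil(n_objects / 1000))
    return requests * price_per_1000_requests / 1000
```

At the quoted EU-West rate, 30000 objects means 30 requests, i.e. $0.00012 per full listing: small beer indeed.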
Russell
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is
your hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/
_______________________________________________
S3tools-general mailing list
https://lists.sourceforge.net/lists/listinfo/s3tools-general