Amazon EMR is a managed Hadoop framework that makes large-scale data processing fast and straightforward. EMR is well suited to tasks such as log analysis, web indexing, and data transformations (ETL).
Check if you need:
Remember that S3 doesn't know about subfolders or subdirectories. A slash is just a slash, one more character in the name. There's only one level of "folders", called buckets. Inside buckets are files, called objects, and each object is identified by its key.
Instead, S3 uses a Prefix to filter the list. Only keys with a matching prefix are displayed.
S3 also allows a delimiter with the --delimiter=X or -d (to use / as the delimiter)
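
To make the prefix and delimiter behaviour concrete, here is a minimal sketch in Python that imitates the listing logic over an in-memory list of keys. It does not talk to S3; the function name list_keys and the sample keys are hypothetical, chosen only for illustration. The assumption is that a listing returns the keys matching the prefix, and that when a delimiter is given, any key containing the delimiter after the prefix is rolled up into a single "common prefix" entry, which is what makes the flat key space look like folders.

```python
def list_keys(keys, prefix="", delimiter=None):
    """Imitate S3 listing semantics over an in-memory list of keys.

    Returns (contents, common_prefixes):
      contents        -- keys that match the prefix and are listed directly
      common_prefixes -- rolled-up "folder-like" entries (only with a delimiter)
    """
    contents = []
    common_prefixes = []
    for key in sorted(keys):
        # Only keys that start with the prefix are considered at all.
        if not key.startswith(prefix):
            continue
        if delimiter:
            # Look for the delimiter in the part of the key after the prefix.
            rest = key[len(prefix):]
            idx = rest.find(delimiter)
            if idx != -1:
                # Roll the key up to the first delimiter after the prefix.
                rolled = prefix + rest[:idx + len(delimiter)]
                if rolled not in common_prefixes:
                    common_prefixes.append(rolled)
                continue
        contents.append(key)
    return contents, common_prefixes


# Hypothetical keys in a single bucket; the slashes are ordinary characters.
keys = [
    "logs/2015/01/app.log",
    "logs/2015/02/app.log",
    "logs/readme.txt",
    "data/input.csv",
]

# Prefix only: every key under "logs/" is listed, however deep it looks.
print(list_keys(keys, prefix="logs/"))

# Prefix plus "/" delimiter: deeper keys collapse into one entry, "logs/2015/".
print(list_keys(keys, prefix="logs/", delimiter="/"))
```

With the prefix alone, all three keys beginning with logs/ come back. Adding / as the delimiter leaves only logs/readme.txt as a direct entry and groups the other two under logs/2015/, much like listing a single directory level.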