Archived on 21 November, 2024 (8 months ago)
Job Description
We are expanding our engineering team. We're looking for someone who is excited about our non-profit, open data mission, proficient with Python, and ideally also familiar with Java. Proficiency with cloud systems such as Spark/PySpark is required. You should be willing to learn the rest: crawling, parsing, indexing, etc.
Remote
Salary
Not Specified
Benefits
Not Specified
Tech Tags
Python, Java, Spark, PySpark, Cloud Systems
Part Time
Date Listed
01 May, 2024 (about 1 year ago)
Common Crawl Foundation
The Common Crawl Foundation has a 17-year-old, 8-petabyte crawl and archive of the web. Its open dataset has been cited in nearly 10,000 research papers and is the most-used dataset in the AWS Open Data program. The organization is also very active in the open source community.
This job is archived, but you can still apply.