Archived on 02 January, 2025 (6 months ago)
Job Description
We are expanding our engineering team and looking for people who are excited about our non-profit, open data mission. Candidates should be proficient with Python, ideally with some Java as well, and proficient with cloud systems such as Spark/PySpark. Our current team is composed of engineers who do some data science and data scientists who do some engineering. We are focused on improving our crawl, making new data products, and using these new data products to improve our crawl.
Remote
Salary
Not Specified
Benefits
Not Specified
Tech Tags
Python, Java, Spark, PySpark, Cloud Systems
Part Time
Date Listed
01 November, 2024 (8 months ago)
Common Crawl Foundation
The Common Crawl Foundation has a 17-year-old, 8-petabyte crawl and archive of the web. Its open dataset has been cited in nearly 10,000 research papers and is the most-used dataset in the AWS Open Data program. The organization is also very active in the open source community.
This job is archived, but you can still apply.