3-5K EUR / Month
As a Big Data Engineer, you will be part of a company that is building a refined search engine for business data across all industries. The team already works with some of the largest companies in the world. The company takes an approach similar to Google’s, providing business data enrichment with unprecedented coverage of SMBs, accurate classification, and in-depth insights based on real-time updates.
You will develop the data extraction and processing mechanisms inside our big data infrastructure (Spark, Cassandra), which currently processes 6 GB of data every minute. Your mission will be to improve the overall quality, breadth, and depth of the data collected on companies worldwide.
- Mine and analyze data from across the web
- Assess the effectiveness and accuracy of new data sources and data-gathering techniques
- Develop custom data models and algorithms to apply to text datasets
- Use predictive modeling to increase and optimize data extraction and data quality at ingestion and post-processing
- Develop the company’s A/B data testing framework and test the quality of the data continuously
- Develop processes and tools to monitor and analyze model performance and data accuracy
- Prototype quickly to solve thorny use cases without getting stuck in theory, as we ship early and often
- Write well-designed, testable, efficient code
- Identify areas of opportunity and improvement
- Experience using statistical programming languages, preferably Scala (or R, Python, SQL, etc.), to manipulate data and draw insights from (very!) large data sets
- Experience working with and creating data architectures (Spark, Cassandra)
- Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with applications
- Any knowledge of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages and drawbacks is a big plus
Interested in finding out more?