- Audience Modeling With Spark ML Pipelines
in Data mining, Statistika > SW > Apache Spark with apache datamining library ml pipeline spark transform
- DeepDive - finds correlations in raw text files and other files that aren’t organized
DeepDive is a system for extracting value from dark data. Like dark matter, dark data is the great mass of data buried in text, tables, figures, and images, which lacks structure and so is essentially unprocessable by existing software. DeepDive helps bring dark data to light by creating structured data (SQL tables) from unstructured information (text documents) and integrating that data with an existing structured database. It is used to extract sophisticated relationships between entities and to make inferences about facts involving those entities. DeepDive helps one process a wide variety of dark data and put the results into a database. Once the data is in a database, one can use a variety of standard tools that consume structured data, e.g., visualization tools like
in Data mining, Statistika > SW with datamining text unstructured
- Improve Your Model Performance using Cross Validation (in R and Python)
in Data mining, Statistika > Studium with cross datamining model validation x
- Kaggle
in Data mining, Statistika > Competitions with competition datamining kaggle