Description: The Senior Data Scientist designs and develops analytics, algorithms, and visualizations that enable stakeholders to extract information, knowledge, trends, and probabilities from large datasets. Works with subject matter experts, users, and stakeholders to understand needs and performance metrics and to design experiments. Analyzes and prepares datasets, including data normalization, data model mapping, and development of semantics-based models. Uses machine learning, statistical, and graph-based algorithms to analyze and draw conclusions from datasets. Implements prototypes and visualizations and assists with integrating these prototypes into production frameworks and analyst workflows.
Responsibilities:
- Work with subject matter experts to identify important information in raw data from a variety of data formats (e.g., SQL tables, structured metadata, network logs, free text)
- Analyze customer needs, project management factors, and available techniques in order to choose optimal data analysis approaches
- Fix errors in input data; normalize data into a consistent format; interpret data to fit existing data models; develop feature vectors for input into machine learning algorithms
- Assist in building taxonomies, ontologies, and other semantics-based data models
- Develop and train machine learning systems based on statistical analysis of data characteristics to support mission automation
- Work with very large datasets
- Leverage machine learning algorithms (e.g., decision trees, Bayesian networks, support vector machines, k-means) to develop analytics for categorization and classification
- Use standard statistical techniques and tests (e.g., null hypothesis testing, significance tests, confusion matrices, cross-validation, sample size estimation) to inform analysis and to make data-driven recommendations and decisions
- Generate reports and visualizations that summarize datasets and provide data-driven insights to customers
- Communicate technical concepts to stakeholders and users
Required Qualifications:
- Eight (8) years of experience as a Data Scientist
- B.S. degree in Data Science, Information Sciences, Informatics, Mathematics, Computer Science, or a related field from an accredited college or university
- Active Top Secret/Sensitive Compartmented Information (TS/SCI) clearance with polygraph
Desired Experience:
- Data science and visualization platforms, such as Jupyter Notebooks, Tableau, Kibana, Splunk, Grafana, TensorFlow, Spark ML, Weka, R, MATLAB, SAS
- Scripting in Python or Perl
- Machine learning algorithms, such as decision trees, Bayesian networks, k-means, support vector machines
- Basic statistical techniques, such as t-tests, chi-square tests, ANOVA, standard deviation, p-value tests, Monte Carlo methods
- Natural language processing, named entity recognition, topic categorization