Responsibilities

- Develop and apply bleeding-edge machine learning algorithms and statistical pattern recognition to extremely large text corpora in the capital markets domain.
- Use statistical natural language processing to mine unstructured data and create insights; analyze and model structured data using advanced statistical methods, and implement the algorithms and software needed to perform these analyses.
- Build document clustering, topic analysis, text classification, named entity recognition, sentiment analysis, and part-of-speech tagging methods for unstructured and semi-structured data.
- Cluster and analyze large amounts of user-generated content, and process data in large-scale environments using Amazon EC2, Storm, Hadoop and Spark.
- Develop and perform text classification using methods such as logistic regression, decision trees, support vector machines and maximum entropy classifiers.
- Perform text mining, generate and test working hypotheses, prepare and analyze historical data, and identify patterns.
- Generate creative solutions (patents) and publish research results in top conferences (papers).

Technology Stack for the Resultant Application

- Data Storage / Analytics: AWS / Cloudera (on-premise) Hadoop ecosystem with MongoDB & Elasticsearch, on S3 or on-premise
- Queueing System: RabbitMQ
- Programming: Python
- Front-End: HTML5 for the web app and Objective-C for iOS; potentially a hybrid framework for iOS and Android
- CDN: AWS or on-premise routing


Klickto Search (Rec.)


Required Skills

- Advanced degree from an accredited college/university in Computer Science, Computational Linguistics, Applied Math or Statistics, Engineering, Bioinformatics, Physics, O.R., or a related field (strong math/stats background with an ability to understand algorithms and methods from both mathematical and intuitive viewpoints).
- In-depth knowledge of various NLP domains such as entity extraction, speech recognition, topic modeling, machine translation, natural language understanding, parsing, question answering, etc.
- Expertise in text mining (probabilistic topic models, word association mining, ontology learning, opinion mining and sentiment analysis, semantic similarity, etc.).
- Expertise in natural language processing/understanding (word representation, sentiment analysis, relation extraction, natural language inference, semantic parsing, etc.).
- Excellent background in machine learning (generative models, discriminative models, neural networks, regression, classification, clustering, etc.).
- Experience in deep learning for NLP/NLU is a big plus.
- Extensive experience using NLP-related techniques and algorithms such as HMM, CRF, deep learning & recurrent ANN, word2vec/doc2vec, Bayesian modeling, etc.
- Success in building strong ontologies/taxonomies.
- Strong data extraction and processing skills and experience.
- Experience in applied statistics, including sampling approaches, experiments, modeling, and data mining techniques.
- Experience building analytical models and working with structured and unstructured data sets.
- Deep expertise in implementing algorithms in Python.
- Experience with data structures and algorithms, and the ability to work in a Unix environment, processing large amounts of data in a big data environment.
- Significant experience building robust data processing and analytics pipelines.
- Experience with one or more modern Big Data technology stacks.
- Contributions to research communities (e.g. ACL, NIPS, ICML, CVPR) are a plus.

