From Materials Science to Machine Learning Engineer at Capital One


David Steinmetz went from a Materials Science PhD to a Machine Learning Engineer at Capital One. A self-described strategically minded analyst, David dabbled in a few online data science courses before deciding to commit to a full bootcamp.

We sat down with David to learn why he chose to attend NYC Data Science Academy, and how he uses that experience in his current job.

Tell us about your background. Did it naturally lead you to data science? If not, what prompted you to enter the field?

I studied materials science, which is a blend of math, physics, and chemistry. I noticed that programming was integral to solving many of the materials science problems, which also tended to be steeped in sophisticated math. I used a genetic algorithm during my undergrad and a particle swarm algorithm during my PhD. After graduating, I joined a management consultancy and learned about business and data analysis at companies. That background, coupled with a genuine interest in computers, naturally led to data science.

What made you consider attending a bootcamp? Were there other things you were considering at the time?

I took online data science courses to get my feet wet. When I realized I really enjoyed the work, I looked at how to learn as much material as possible as quickly as possible. A friend mentioned the bootcamps, and I quickly decided it was the right move for me. I had been considering jobs in materials science, but data science drew me in.

What skills were most useful in helping you land your position at Capital One?

The most useful skills were practical experience with a number of machine learning algorithms, project work displayed on GitHub, and a working knowledge of data structures and algorithms. The question an interviewer is really asking is “can this person do the job?” The more project work you have on your public profile, the less of a risk it will be to hire you, because the hiring manager can already see your abilities.

What tools do you find most relevant to your position? Which skills did you think would be most important?

Python, AWS, GitHub, Scala, and Spark are the tools most relevant to my current position and project. I use Pandas and Spark Datasets often, and GitHub always. I thought R would be used more, but it’s not, because it’s harder to use R in production. I also thought I would rely more on the standard machine learning libraries, but we don’t hesitate to implement an algorithm that doesn’t exist in scikit-learn or MLlib if it suits our purposes.

Can you describe your day-to-day job as a data scientist?

I often spend time reading original research papers and books in an attempt to find state-of-the-art approaches to the problem I am trying to solve. The rest of my time is spent programming, visualizing data, getting input from colleagues, and creating new products to meet our clients’ needs. I use cloud services and open source software extensively, which allows me to iterate quickly and try new approaches.


It’s varied, mentally challenging, and at the cutting edge of implemented machine learning. The people I work with are all fascinating, and it’s motivating and an honor to be able to work with them.

What skills does your team look for in a data scientist?

We look for people who are curious, passionate, and well-rounded in the sense that they have experience both with data engineering and distributed systems and with data science and machine learning. Since we work so much in the cloud, knowledge of cloud services is a plus. A lot of work is done in Scala and Java, so knowledge of one of those two also helps.

What advice do you have for people looking to enter the field?

There is a lot to learn in this field, so pick one thing and learn it well before moving on to another. Learning many things superficially will backfire once you get into the interview or onto the job. There, deep understanding and the capacity for further learning are necessary. A bootcamp is a great way to both gain that deep understanding and cover the breadth of material necessary to get you started in the field. Whatever you do, get advice on what to learn; otherwise what you are learning might not be best suited to your situation.





Shruti Agrawal, October 7, 2017
Thank you, David, for the insightful interview about beginning a career in data science. The last answer mentions that it is important to know what to learn. Can you tell me how we can know what to learn? I am struggling with this question. Does it depend on the target job? That could be hard, because one does not know which job at which company one is going to get. It could also be too late: by the time one has learned the material, the vacancy might already have been filled. So how can one work out what to focus on amid the plethora of knowledge in data science?



