Top products from r/dataengineering

We found 10 product mentions on r/dataengineering. We ranked the 9 resulting products by number of redditors who mentioned them. Here are the top 20.


Top comments that mention products on r/dataengineering:

u/imcguyver · 1 point · r/dataengineering

https://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247

This is a highly recommended book for the data warehouse industry. Hope you enjoy it and good luck.

u/kristofleroux · 1 point · r/dataengineering

I'm reading this book now:

https://www.amazon.com/Agile-Data-Science-2-0-Applications/dp/1491960116

And OK, it's already 2 years old, but it is amazing: it depicts the complete agile data science process using Kafka, Spark (Core, Streaming, SQL, MLlib), Airflow, Elasticsearch, MongoDB, scikit-learn, and D3.js, and shows how to improve and deploy your pipeline.

u/yahelc · 4 points · r/dataengineering

The most important reading from a database design perspective, IMO, is one of Kimball’s books:

https://www.amazon.com/Data-Warehouse-Toolkit-Definitive-Dimensional/dp/1118530802

It’s less technically focused, and more focused on how to build good datasets. It’s an older text, so its references to specific technologies are a bit out of date, but when it comes to describing how to design particular schemas (or at least speak the language of people who design schemas), it’s pretty much canon.
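To give a flavor of the kind of schema design the book is known for, here is a minimal star-schema sketch using Python's built-in sqlite3 module. The table names, column names, and rows (`dim_product`, `dim_date`, `fact_sales`) are invented for illustration and are not taken from the book.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        name TEXT,
        category TEXT
    );
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        year INTEGER,
        month INTEGER
    );
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        quantity INTEGER,
        amount REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Widget', 'Hardware'), (2, 'Gizmo', 'Hardware'), (3, 'Guide', 'Books');
    INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 20240101, 2, 20.0), (2, 20240101, 1, 15.0),
        (3, 20240201, 4, 40.0), (1, 20240201, 1, 10.0);
""")

def sales_by_category_and_month(connection):
    """Join the fact table to its dimensions and total the sales amounts."""
    query = """
        SELECT p.category, d.month, SUM(f.amount) AS total_amount
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
    """
    return connection.execute(query).fetchall()

totals = sales_by_category_and_month(conn)
for category, month, total_amount in totals:
    print(category, month, total_amount)
```

The point of the layout is that measurements live in the fact table while descriptive attributes live in the dimensions, so most reporting queries follow the same join-and-group shape.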

u/Chr0nomaton · 1 point · r/dataengineering

I attended UCI for CS, and am going through the process of a master's right now. I'm a data engineer / data platform engineer at a startup, and have been doing it for ~2 years or so. I find that the traditional CS knowledge is a tool belt that you don't necessarily *need* to get through industry.

There are a lot of really good algorithm books out there. O'Reilly has Algorithms in a Nutshell, which covers big-O notation and then walks through some basic data structures and algorithms (linked lists, trees, sorting). DS and algos are really the *core* CS things that one would need. Some community colleges offer these courses, which might be better depending on your circumstances.

The upper division classes are useful, I think. I took a few classes on distributed systems and computer architecture, which have been invaluable. I took a class on databases (useful I suppose but meh), and some classes on machine learning, artificial intelligence, and operating systems. Those have become more useful now that I'm doing data platform work.

All that being said, I think the only disadvantages you have are the terminology ("This will give O(n log n) lookup while retaining referential integrity") and the boxes to tick. Terminology you can learn. The boxes to tick might be tougher. I think some companies will be really strict about that stuff. You did say that you have an undergraduate education, though, so I don't think that will matter.

u/t-vanderwal · 3 points · r/dataengineering

Definitely. We actually used that book for my Business Intelligence master's course in my MIS program. I met a BI manager hiring for a data engineering role and she recommended the following text as well. The content is pretty similar, since both focus on the Kimball method, but this one also goes over BEAM*, which is a requirements-gathering framework for designing data warehouses.

https://www.amazon.com/Agile-Data-Warehouse-Design-Collaborative/dp/0956817203/ref=sr_1_1?s=books&ie=UTF8&qid=1511661160&sr=1-1&keywords=agile+data+warehouse+design

u/jwfergus · 3 points · r/dataengineering

Re-iterating what the previous posters said: the fundamentals are the same regardless of system. Learning how to get data out of a SQL system is all about learning how to write SQL.

To effectively learn how to write SQL for data engineering, I highly recommend grabbing a book like one of these*:

  1. SQL Quickstart Guide
  2. SQL Queries
  3. If you're an experienced programmer, maybe T-SQL Fundamentals (Microsoft-flavored SQL)

    and grabbing a sample database for the system of your choice:

  4. MySQL sample Employee db
  5. PostgreSQL sample dbs
  6. SQL Server - stackoverflow db

    and then practice some of your chosen book on the sample db.
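As a rough sketch of what that practice loop can look like (not taken from any of the books above), the snippet below uses Python's built-in sqlite3 module with a tiny in-memory table. The `employees` table, its columns, and its rows are made up for illustration; a real sample database such as the MySQL Employee db has its own schema.

```python
import sqlite3

# Hypothetical stand-in for a sample database: one small in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
    [
        ("Ana", "data", 120),
        ("Ben", "data", 100),
        ("Cy", "web", 90),
    ],
)

def avg_salary_by_dept(connection):
    """Return (dept, average salary) rows, highest average first."""
    query = """
        SELECT dept, AVG(salary) AS avg_salary
        FROM employees
        GROUP BY dept
        ORDER BY avg_salary DESC
    """
    return connection.execute(query).fetchall()

rows = avg_salary_by_dept(conn)
for dept, avg_salary in rows:
    print(dept, avg_salary)
```

Swapping the in-memory table for one of the sample dbs above and rewriting the query for each exercise in the book is the whole idea.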

    Notes and words of warning:

  • Writing SQL for data engineering or programming is really different than "database administration." A lot of resources on the web are geared towards DBAs and it probably won't help you out much.
  • University courses on databases tend to be more theoretical than practical when it comes to learning how to write SQL. University isn't a super efficient way to learn to write SQL.

    (*I'm not affiliated w/ any of those books)