Top products from r/hadoop

We found 9 product mentions on r/hadoop and ranked the resulting products by the number of redditors who mentioned them. Here are all 9.

Top comments that mention products on r/hadoop:

u/recitegod · 2 pointsr/hadoop

Class 10, but it does not really matter. Once you run them 24/7 for 12 weeks, they get physically corrupted at the block level. Not fun (you throw them away). I got the good ones, a combination of Samsung, Memorex, Sony, and Patriot; they all fail at some point. I would highly recommend creating a USB mount point, like so: https://www.raspberrypi.org/blog/pi-3-booting-part-i-usb-mass-storage-boot/

You don't build such a cluster to achieve anything but learning. You stretch every aspect of this cluster design (power, eth0, throughput, writing on the card, interface, everything) and you will definitely have some "gotcha" moments as you troubleshoot along the way. Highly recommend. Building a 5-node cluster is better than 3. Nine nodes are overkill, but still fun.

If I had to redo it all over again, I would get a 4gb pack:
https://www.amazon.com/10-PACK-SanDisk-SDSDQAB-008G-Packaging/dp/B00MHZ70KO/ref=sr_1_6?ie=UTF8&qid=1481181417&sr=8-6&keywords=8gb+micro+sd+card+pack

and 9 of these:
https://www.amazon.com/Samsung-METAL-Flash-MUF-32BA-AM/dp/B013CCTM2E/ref=sr_1_fkmr2_2?ie=UTF8&qid=1481181453&sr=8-2-fkmr2&keywords=usb%2Bkey%2Bsamsung%2Busb3&th=1

I believe I will be able to get a 2mb/sec sustained throughput on Spark with this sort of setup, not sure, possibly more. Got to try. Why the USB3? I image these on the laptop, so it is much faster than over USB2. If not, USB2 keys will suffice. More than 32gb of storage per node is useless.
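For a rough sense of why USB3 matters when reimaging nine keys from a laptop, here is a back-of-the-envelope sketch; the sustained write speeds are assumptions, not measurements:

```java
// Rough reimaging-time estimate for nine 32 GB keys written from a laptop.
// Both sustained write speeds below are assumptions, not measurements.
public class ImagingTime {
    public static void main(String[] args) {
        final double keys = 9;
        final double sizeMB = 32 * 1024;  // 32 GB per key
        final double usb2MBs = 30.0;      // assumed sustained USB2 write speed
        final double usb3MBs = 100.0;     // assumed sustained USB3 write speed
        final double totalMB = keys * sizeMB;
        System.out.printf("USB2: ~%.0f minutes%n", totalMB / usb2MBs / 60);
        System.out.printf("USB3: ~%.0f minutes%n", totalMB / usb3MBs / 60);
    }
}
```

Under those assumed speeds, a full reimage of all nine keys drops from roughly 2.7 hours over USB2 to under an hour over USB3.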

FUCKYOUINYOURFACE is right: even with this setup, it is a slow-as-shit cluster, but you absolutely control this environment in every dimension you can think of. And that is awesome in my book.

u/batoure · 3 pointsr/hadoop

So it's been my experience that, because the meta-ecosystem of Hadoop evolves weirdly and sometimes at breakneck speed, I have yet to find one or two books to rule them all.

That being said, start with "Hadoop Security" by Ben Spivey.

https://www.amazon.com/Hadoop-Security-Protecting-Your-Platform/dp/1491900989/ref=sr_1_1?ie=UTF8&qid=1498101555&sr=8-1&keywords=hadoop+security

One of the blind spots that 80% of the clusters I have interacted with share is securing the ecosystem. Have you ever seen the interview where Berners-Lee says it terrifies him that most of the systems on the International Space Station run over HTTP? Similar situation here. Most big data platforms have trust as a default and require strategy and intelligent thought up front for proper configuration.
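To make that "trust as a default" point concrete: a stock Hadoop client runs with "simple" authentication, meaning the cluster believes whatever identity the client process claims. Here is a minimal sketch of what flipping a client over to Kerberos looks like, assuming an already-Kerberized cluster; the principal and keytab path are hypothetical placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Out of the box these default to "simple" and "false": the cluster
        // trusts whatever username the client process reports.
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("hadoop.security.authorization", "true");
        UserGroupInformation.setConfiguration(conf);
        // Hypothetical service principal and keytab path.
        UserGroupInformation.loginUserFromKeytab(
                "svc-etl@EXAMPLE.COM",
                "/etc/security/keytabs/svc-etl.keytab");
        System.out.println("Authenticated as: "
                + UserGroupInformation.getLoginUser());
    }
}
```

The cluster side needs the matching settings in core-site.xml, plus KDC setup and keytab distribution, which is the kind of up-front work the book covers.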

Monitoring and managing is where things get complicated; management is essentially what the various flavors of Hadoop are arguing about. The two majors are Cloudera with "Cloudera Manager" and Hortonworks with "Apache Ambari".

This isn't a plug per se, but I have found that, though it often feels like a Frankenstein's monster of knitted-together tools, Cloudera Manager seems to provide a slightly less painful install process. Additionally, their parcels system lets you manage and mess around with various versions of tools pretty easily. At the end of the day none of these tools are going to hold your hand, so expect one of two things to happen:

- You get the cluster up fast and working-ish in "Let's please not call it production" mode.
- You get a clean and stable cluster running and start wondering what data to ingest, after working on this steadily for the next 3 months or so.

The danger of option 1 is just how achievable it is: a couple of months from now the company will start talking about this as vital infrastructure, and you will have a problem on your hands.

u/cleric04 · 1 pointr/hadoop

If you have an O'Reilly subscription, there is a pretty good video with Ben and a few other Cloudera guys that's probably a good quickstart on cluster security: A Practitioner's Guide to Securing Hadoop.

Hadoop Operations is a good book, but it's getting a bit outdated. I'd say just start setting up a cluster and maybe read it if you have some free time.

u/wolf2600 · 1 pointr/hadoop

I don't really want a plug & play solution; I'd like to mimic the real thing as closely as possible: starting with ~3 Linux boxes, configuring them, then installing and configuring Hortonworks on them.

End goal would be the Hortonworks HDP Administrator certification, so I need to know how to do everything myself.

oooh...

u/MrFromEurope · 2 pointsr/hadoop

Well, that sucks. I already have this one: Hadoop Guide

There is a whole chapter on HBase in there but nothing else.

u/e13e7 · 2 pointsr/hadoop

1: You don't need big guns for NN/JT on a small cluster, but I assume you are running something on that box anyway to put it to use.
2: Yes for config, ehh for settings (but you knew that, sorry for pointing it out like you didn't)... ._. Still, be wary of how multi-threading is handled.
3: I didn't, I was an intern for all this - and my DBA mentor hated the whole deal, and long term it's not how you wanna do it, especially if what you're doing may or may not be breaking some privacy policies... Some of the cheaper hardware I was looking at came in at around $6k for 8 nodes with 32 Intel cores and 64G of RAM total. I doubt you need that power, but that's what I was considering to chomp through 5-10G of raw data per day on a timely basis, given that up to 100 Pig queries with multiple joins were the task at hand.

500G of input with group, join, join, order (about 4 maps / 3 reduces per job) took over 3 hours for 3 nodes w/(4cores..6Gmem..1Tdisk as /tmp and /data). Hosting some of the files on mem and not writing to /tmp really cut down on that replicated joins, but I could only do it in rare circumstances, 6G really isn't a lot @_@