Computers & technology books>Books>Computer science books>AI & machine learning books

Best natural language processing books according to redditors

We found 73 Reddit comments discussing the best natural language processing books. We ranked the 30 resulting products by number of redditors who mentioned them. Here are the top 20.

1. Speech and Language Processing, 2nd Edition

14 mentions

Read Reddit comments View price

2. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit

6 mentions

O Reilly Media

Read Reddit comments View price

3. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary

6 mentions

O Reilly Media

Read Reddit comments View price

9. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary

2 mentions

Read Reddit comments View price

10. Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax (Synthesis Lectures on Human Language Technologies)

2 mentions

Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax (Synthesis Lectures on Human Language Technologies)

Read Reddit comments View price

11. Parsing Techniques: A Practical Guide (Monographs in Computer Science)

2 mentions

Read Reddit comments View price

12. Machine Ethics

2 mentions

Read Reddit comments View price

13. Spoken Language Processing: A Guide to Theory, Algorithm and System Development

2 mentions

Read Reddit comments View price

14. Logics and Falsifications: A New Perspective on Constructivist Semantics (Trends in Logic (40))

1 mention

Read Reddit comments View price

15. Mathematical Linguistics (Advanced Information and Knowledge Processing)

1 mention

Read Reddit comments View price

16. Understanding Japanese Information Processing

1 mention

Used Book in Good Condition

Read Reddit comments View price

17. The Handbook of Computational Linguistics and Natural Language Processing

1 mention

Used Book in Good Condition

Read Reddit comments View price

18. Human and Machine Hearing: Extracting Meaning from Sound

1 mention

Read Reddit comments View price

19. Prolog and Natural-Language Analysis (Center for the Study of Language and Information Publication Lecture Notes)

1 mention

Used Book in Good Condition

Read Reddit comments View price

20. Modeling and Reasoning with Bayesian Networks

1 mention

Used Book in Good Condition

Read Reddit comments View price

Top Reddit comments about Natural Language Processing:

u/ixampl · 15 pointsr/compsci

I think the field you are looking for is called Natural Language Processing.
There is a nice introductory lecture on it on coursera.

I think this is the standard introduction book.

u/AchillesDev · 12 pointsr/cscareerquestions

Hey I have no CS degree and started out knowing nothing about ML! I do a mix of data science and data engineering, my internal clients are data scientists (and biologists!), and my boss is a data scientist. I build applications around models, build some models and applications around them, etc.

A few things that have been helpful:

Google's ML Crash Course
AWS Machine Learning Full Course
Introduction to Machine Learning with Python

I have a ton of other books (and a lot of deep learning, but most of the frameworks make it easy peasy), but really after doing the Google crash course, the most helpful thing has been building and training models. You'll need to get familiar with tools like Jupyter notebooks, packages like Pandas, frameworks like TensorFlow, scikit-learn, etc. and maybe some data visualization stuff. I like to use AWS SageMaker, since you can spin up whatever resources you need for training and deployment, and has easy access to other AWS resources like S3.

u/Odds-Bodkins · 9 pointsr/KotakuInAction

The four-valued one that that I had in mind was Belnap's relevance logic which is outlined in this Stanford article here.

I've never studied 4-valued logics, but I have studied the Łukasiewicz 3-valued logic (just above Belnap's on that page) and I guess it's similar in principle - we just accept that statements can be True, False, or Neither. "Neither" could be cashed out as "Undefined Truth Value". Ł_3 is used by Kripke in his Outline of a Theory of Truth to ascribe a truth-value to those "ungrounded" statements in a construction which aims to circumvent the Liar paradox (and kinda does, at a cost). The four-valued one is similar, we just have a fourth truth-value. I believe both of these logics have been applied in computer science.

The important thing is that these are perfectly reasonable mathematical/logical constructions - whether you think the Law of Excluded Middle holds in some "true logic of the universe" sense, and these are just bastardisations of classical logic, is another thing.

If you want a book, I'd suggest Greg Restall's Introduction to Substructural Logics.

For a fairly high-level modern primer on intuitionistic logic, I'd try Kapsner's book.

And if you're interested in the current research on homotopy type theory (which has "built-in" intuitionistic logic), the main resource is the Univalent Foundations book which is available as a regularly updated PDF, completely gratis. Feel free to donate though :).

Edit: I misremembered, Kripke actually uses Strong-Kleene 3-valued logic, which is almost the same as Ł_3 but has a slightly different IF-THEN rule.

u/boxstabber · 5 pointsr/LanguageTechnology

This is the standard introductory book to the field: http://www.amazon.co.uk/Language-Processing-Prentice-Artificial-Intelligence/dp/0131873210

u/mredding · 5 pointsr/compsci

I can't speak of a specific book that is a comprehensive history of computing, but I will speak to books that speak of our culture, our myths, and our hero's.

Hackers and Painters, by Paul Graham. People are polarized about the man, whether he's too "pie in the sky" - full of shit and ego, or if he speaks as an ambassador to our most optimistic ideals of our (comp-sci) culture. The contents of this book is a collection of his essays that are inspirational. It made me forego the societal pressures within our culture and reject popular opinion because it is merely popular and just an opinion, which is a virtue no matter who you are, where you are, or what you do. All these essays are on his website, though. If you want to review them, I recommend Hackers and Painters (the essay), What You Can't Say, Why Nerds are Unpopular, and The Age of the Essay; his oldest essays are at the bottom of the page and go up - he writes about what he's thinking about or working on at the time, so you'll see the subject matter change over time. So much of this will have direct application to his middle school and high school life. I cannot recommend this book, and the rest of his essays, enough.

If he wants to get into programming, I recommend The Pragmatic Programmer. This book talks about the software development process. I'm not going to lie, I don't know when best to introduce this book to him. It's not a hard read whatsoever, but it's abstract. I read it in college in my first months and said, "Ok," and put it down. Approaching the end of college and my first couple years in my profession, I would reread it ever 6 months. It's a kind of book that doesn't mean anything, really, without experience, without having to live it, when he has an obligation to his craft, his profession. I tell you about this one since you're asking about books to tell him, because this isn't something someone would normally come up across without being told about it.

The Cathedral and the Bazaar is a telling book about the cultural differences between the proprietary monoliths like Apple and Microsoft, and the Free and Open Source Software communities that back such popular software as Linux (the most popular operating system on the planet, running on all top 500 super computers, most server computers on the internet, and all Android phones) and Chrome(the worlds most popular web browser). Indeed, this book directly reflects the huge cultural battle that was duked out in the field, in the industry, and in the courts from the mid-90s and into the 2000s. It advocates helping the community, contributing to something larger than yourself, and that none of us are as good as all of us. To paraphrase Linus Torvalds(inventor of Linux) "Given enough eyeballs, all bugs are shallow."

It's important to know who the hero's are in our culture, and they are diverse and varied, they're not just computer scientists, but mathematicians, physicists, philosophers, science fiction writers, and more. I would find a good book on Nicola Tesla, since he invented basically everything anyway (Thomas Edison was a great businessman, but a bit of a tosser), Richard Feynman was a physicist who is still celebrated in his field, and he even worked for Thinking Machines, back in the day, which was a marvel of it's time. Seymour Cray founded Cray Supercomputers and they have a lasting legacy in the field, a biography on that would be interesting. A biography on Symbolics and their Lisp Machines will make him yearn to find one still functioning (a rare gem that crops up every now and again, though he can run one in an emulator), and about the "AI Winter", a significant historic era (note: the AI Winter is over, and we are in spring, the history is both compelling and enthralling). Anything Issac Asimov published (in nearly every category of the dewy decimal system) is also compelling, and hardly dated. In fact, he's the originator of a lot of modern sci-fi. Charles Babbage invented the modern computer (though it was entirely mechanical in his day, and it wasn't actually built until 1996-2002) and Ada Lovelace was the worlds first computer programmer. A woman! Speaking of women, and it's worth young men learning this about our history, Grace Hopper was a military computer engineer who invented the term "bug".

And speaking of women, someone I have respect for, especially if your boy wants to get into game development is Sheri Graner Ray's Gender Inclusive Game Design, which may be more appropriate when he's in high school, and I consider it required reading for anyone who wants to enter the gaming industry. The book lays out plainly how video games hyper-sexualize both women, and, for some reason surprisingly to many - men, it's disastrous effects it has for the game industry, the game market, and the gaming community, and insights on how we may combat it. I have seen colleagues (men) become indignant and personally offended at reading this book, but were absolutely humbled when they took the fight to Sheri directly (we had a few phone interviews with her, always fantastic). If your boy found a problem with this book, he would do well to read Paul Grahams essay on keeping his identity small... The subject matter is not a personal attack on the individual, but on the blight, and he would be better served finding himself on the right side of history with this one, it would serve him well if he were to pursue this craft, specifically, but also any forward facing media in general.

And I also recommend some good books on math. Algebra, linear algebra, calculus, and statistics. You can get very far, lead an entire career unto retirement without knowing anything more than arithmetic and basic, basic algebra, but he could only serve himself well if he makes the decision that he is going to like maths and chooses to willfully become good at it. Outside the context of school and terrible teachers, it's actually an enthralling subject. Just get him a copy of Flatland, Flatterland, and Sphereland. Try this. There are books about proofs that break them down into laymen terms so that anyone can celebrate how special they are. My wife has a few on the shelf and I can't remember their titles off hand. Also this, the book is the narrative of some witty laymen who discover a whole branch of mathematics from first principles, the surreal numbers, an extension of imaginary numbers. It's really quite good, but might keep him occupied for a couple years in high school.

I should stop here or I never will.

u/slashcom · 5 pointsr/compsci

In Natural Language Processing, it's Jurafsky and Martin. In Machine Learning, it's debatably the Bishop book.

u/hapagolucky · 5 pointsr/MachineLearning

Start with Jurafsky and Martin to get a rounded overview of the main problems and approaches. I don't use NLTK myself, but it has a large community around it and some decent tutorials I hear.

u/UmamiTofu · 4 pointsr/askphilosophy

>It seems like all the research involving AI alignment seems to be done by computer scientists using machine learning.

Not exactly. Most research here doesn't use machine learning, and much of it looks at issues which are simply above and beyond the question of how an agent is going to learn a classifier function or approximate its value function. That being said, it is largely a matter of computer science in general.

>What role do philosophers have in this conversation?

If decision theorists count as philosophers then there is plenty of work to be done; see Stuart Armstrong's work on corrigibility, Jessica Taylor's work on quantilizers, and Nate Soares, Eliezer Yudkowsky and Wei Dai's work on Functional Decision Theory and its predecessors TDT and UDT. It's worth noting though that it seems better to approach this from a mathematical or computer science background rather than from philosophy if you are doing it for the purposes of advanced AI development.

You can get into more traditional philosophy territory by analyzing about how a superintelligent agent will make decisions and act, as long as you don't get carried away from computational reality. The orthogonality thesis in particular is amenable to philosophical analysis. Here are a couple relevant papers, one from a computer scientist and one from a philosopher.

https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf

https://philpapers.org/archive/PETSAS-12.pdf

Finally there is the basic ethical question of what ends advanced AI should achieve, which is a clearly philosophical question. You could technically call this machine ethics, but it is separate from other such work (described in the next part of this comment) in that it assumes very advanced systems. Here are examples of the kinds of ideas at stake:

https://intelligence.org/files/CEV-MachineEthics.pdf

https://intelligence.org/files/CEV.pdf

https://foundational-research.org/wp-content/uploads/2016/08/Suffering-focused-AI-safety.pdf

>Furthermore, what other subfields of AI ethics are there besides AI Alignment?

Machine ethics, which is the question of how AI agents should behave, under the premise of them having human or subhuman level of general intelligence. There is plenty of this in r/AIethics. Wallach and Allen's Moral Machines is a good book here, also this recently came out but I haven't read it. Also there are more papers here. Actual implementation is usually a big part of it these ideas.

Then there is the question of when and how AI interests should be given moral weight. Some interesting stuff here would be:

http://stevepetersen.net/petersen-designing-people.pdf

https://arxiv.org/pdf/1410.8233.pdf

Then, there are arguments about whether it is morally wrong to use artificially intelligent weapons. There is some philosophical literature on when data science and machine learning classifiers and recommenders are fair or unfair. And if you go into legal and political philosophy you could make judgements regarding the rules and policies for developing and using AI.

u/marvin_sirius · 4 pointsr/perl

Using foreach over an array index would generally be considered non-idiomatic or "not very Perly". Sometimes you really need that index though. At least you didn't do a C-style for (my $i = 0;....

Using the references directly should be faster in theory. In reality, probably doesn't matter much.

EDIT: BTW, following the style of your co-worker has merit too. If you want to learn more about Perl style, checkout Damian Conway's Perl Best Practices.

u/mhatt · 4 pointsr/compsci

I would repeat jbu311's point that your interests are way too broad. If you're interested in going into depth in anything, you'll have to pick a topic. Even the ones you mentioned here are fairly broad (and I'm not sure what you meant about concurrency and parallelization "underscoring" AI?).

If you want to learn about the field of natural language processing, which is a subfield of AI, I would suggest Jurafsky and Martin's new book. If you're interested more broadly in AI and can't pick a topic, you might want to check out Russell & Norvig (although you might also want to wait a few months for the third edition).

u/RB-D · 4 pointsr/datascience

Soeech and Language Processing is often considered to be a good introductory text to NLP regardless of which side you come from (linguistics or maths/CS), and thus should provide enough information about linguistic theory to be sufficient for doing most of the standard NLP tasks.

If you would prefer a pure linguistics book, there are many good options available. Contemporary Linguistic Analysis is a solid introductory textbook used in intro ling classes (and have used it myself to teach before).

You might also wish to read something more specific depending on what kind of language processing you end focusing on, but I think a general fundamental understanding of ideas in linguistics would help a lot. Indeed as you are probably aware, less and less of modern NLP uses ideas from linguistics in favour of data-driven approaches, so having a substantial linguistics background is often not necessary.

Sorry for only having a small number of examples - just the first two that came to my head. Let me know if you would like some more options and I can see what else I can think of.

Edit: missed some words

u/Buckwheat469 · 4 pointsr/opensource

I suggest reading the Cathedral and the Bazaar by Eric S Raymond.

u/Mauss22 · 4 pointsr/askphilosophy

I'll pass along wokeupabug's typical recommendations:

>A good broad introduction is Lowe's An Introduction to the Philosophy of Mind (for a broader, philosophy and cognition sort of approach). For an introduction more focused on the mind-body problem, you have lots of options; Kim's Philosophy of Mind and Heil's Philosophy of Mind... are good choices. For a history anthology approach, the Chalmers' Philosophy of Mind... is a good choice; a little more accessible would be Morton's Historical Introduction to the Philosophy of Mind.

And the recommendation from the FAQ page:

>For philosophy of mind, Searle's Mind: A Brief Introduction.

I don't really know what you mean by a 'consideration of the future'. Do you mean issues that could crop up in the future germane to phil. mind (A.I., cog. enhancement, etc.)? If so, that's a tough one! Likely just the Cambridge Handbook. The introduciton is avail here if you'd like a preview. And this book on Machine Ethics is recommended on the PhilPapers bibliography.

&#x200B;

u/ninjin · 3 pointsr/MachineLearning

[Manning and Schuetze](
http://www.amazon.com/Foundations-Statistical-Natural-Language-Processing/dp/0262133601) is also worth a mention even if I personally prefer the style of Jurafsky and Martin and that is slightly less dated.

Just to put NLTK into some sort of a frame. As far as I know, no researcher publishes anything using NLTK. At least two years ago or so when I had a look at the NLTK book what it does introduce is essentially the state-of-the-art from the mid-90;s which is just fine for an introductory course and playing around but not how you do things in research. This is not an attack on NLTK, just more of pointing out what it actually is.

u/[deleted] · 3 pointsr/programming

Also, this is the standard textbook on speech recognition.

To address submitter's problem, if you want speech rec as a tool instead of a research problem, you're almost certainly going to be better off using Dragon or Microsoft than trying to train your own Sphinx/HTK system.

u/Milleuros · 3 pointsr/trendingsubreddits

I can suggest having a look at r/DataScience , they seem to be focused on how to become an actual data scientist and get a job with that.

Machine learning is a tool by the way, which you generally learn while doing other things. I'm personally using ML in the framework of my PhD in Physics. I'll most probably be eligible for jobs as data scientists afterwards. I do know a lot of maths, which is useful to understand deep down what is going on.

Of course there are self-taught data scientists and analysts. I know some people started by e.g. reading around on the web (there are a lot of blogs, open source code, ...) and then participating to competitions on Kagle.

I will make some advertisement for MachineLearningMastery.com, because that blog was very helpful when I started. It's a blog that proposes to learn ML in a top-down approach: start by coding and practising, understand later. And also this book, which you might be able to find on the internet. For people more into theory and who want to see the maths behind it, a 800 pages book on deep learning

(At that point I'm just throwing infos and links in case anyone is interested)

u/cyorir · 3 pointsr/paradoxpolitics

Have you heard of this thing called Natural Language Processing?

You too can learn how to use NLP to analyze text quickly with computers. Start by reading a book like this or this, then solve practice problems like these.

You, too, can learn how to process a corpus of 650,000 emails in 8 days!

u/drknowledge · 3 pointsr/linux

The Cathedral & the Bazaar

u/formantzero · 3 pointsr/linguistics

From what I understand, programs like the University of Arizona's Master of Science in Human Language Technology have pretty good job placement records, and a lot of NLP industry jobs seem to bring in good money, so I don't think it would be a bad idea if it's something you're interested in.

As for books, one of the canonical texts in NLP seems to be Jurafsky and Martin's Speech and Language Processing. It's written in such a way as to serve as an intro to computer science for linguists and as an intro to linguistics for computer scientists.

It's nearing being 10 years old, so some more modern approaches, especially neural networks, aren't really covered, iirc (I don't have my copy with me here to check).

Really, it's a pretty nice textbook, and I think it can be had fairly cheap if you can find an international version.

u/kherux · 2 pointsr/grammar

If I can't find the topic in the first two, I can usually find it in the third:

u/videoj · 2 pointsr/MachineLearning

O'Reilly has published a number of practical machine learning books such as Programming Collective Intelligence: Building Smart Web 2.0 Applications and Natural Language Processing with Python that you might find good starting points.

u/thuvh · 2 pointsr/LanguageTechnology

Check it out: http://www.amazon.com/Foundations-Statistical-Natural-Language-Processing/dp/0262133601. I've only read a part of this book, but i think this is good book.

u/sudoatx · 2 pointsr/linux

"The art of Unix Programming" by Eric S. Raymond. Not as intimidating or outdated as you might think - This book goes over the history and philosophical concepts behind not only Unix (and Linux), but also the Open Source initiative behind Linux. ESR's other work is "The Cathedral & the Bazaar" which is also worth a look, but I would argue is dated now, as much of what was suggested in this book has already come to pass with too many real world examples to mention.

u/gogromat · 2 pointsr/Anarchism

What I was thinking recently was to ask/suggest forming a board/forum where everyone can discuss these issues.

I am a developer, and as you may know a lot of software is open sourced. Apache project for example has created a multitude of products made by volunteer contributions. Valve corporation internally is organized as anarcho-syndicalism (You may recognize the author of this blog, Yanis Varoufakis, a current Finance Minister of Greece (newly elected left-wing government party)). A lot of books that talk about openness, and sharing of ideas have been turned into software books, like The Cathedral and the Bazaar. A lot of software projects breathe anarchy.

My point is software developers have a great open platforms where everyone can make projects, submit issues and form discussions. For example the Github is such a platform.

Without going into long detail, my suggestion was going to be making a project on Github that is well-organized and supported by anarchist community. It would consist of main points of how/what can/should be done in plain, Simple English.

This will make it highly accessible, easy-to-replicate (In software terms "forking" - on github you can easily copy the project for yourself, store it on any number of computers that you have, all done securely).

Inside it can give key insights/ideas of anarchists that are easily to modify and submit new/alternative strategies for each entry. Like you can even make pages about "revolution" and steps that are necessary/advised for people to take.

Of course it takes a little time to get used to using this platform, but we can even take it step further and create something even simpler out of it.
I don't know how many developers are on this subreddit, but even a small number is enough to keep track of this project.
I think it can be a very successful endeavour. Comments/Questions/Suggestions?

u/breakz · 2 pointsr/MachineLearning

Sarah Guido from Bitly and NYC Python is working on exactly this book:

http://www.amazon.com/Introduction-Machine-Learning-Python-Sarah/dp/1449369413

u/bmcnett · 2 pointsr/LearnJapanese

Back when I first studied Japanese in 1991, times were a little different. I started from zero knowledge, and took a class at the local Japanese mall for 1.5 hours a week, for about 2 years, to get to what I guess you'd call N3 level today. Didn't really study per se, just set my mind to "full retention" whenever I was around the school.

At the time, an "English" computer wouldn't even display Japanese, so I got involved in development of software that bridged the gap, and even got written up in a book about the subject in 1993 (which says more about the sorry state of internationalization at the time, than it does about me personally) https://www.amazon.com/Understanding-Japanese-Information-Processing-Lunde/dp/1565920430

Later transferred to New York University's Japanese program, which didn't do much for me. It was about 25 people to a class, and more than half of them spoke Japanese as kids, and took the class for the easy grade.

Been hovering at about N2 level now for about 25 years, but thanks to the encouragement of a coworker, started getting back into studying. More like, really studying for the first time. Using Anki during my daily 3 hour commute, with a Bluetooth puck on the steering wheel.

Back in the early 90s, I remember there being no expectation, really, that a person would learn Japanese as a language, in the way that a person would learn French or Spanish. At the first school I went to, there were students who had been studying there for five years, who couldn't really swing a basic conversation about daily life yet. I was progressing at a rate that you'd consider normal today, but they treated me like I was some kind of miracle student. In retrospect, this was pretty strange.

u/monkeyvselephant · 2 pointsr/programming

Read this afterwards. If you're going to be a programmer, you should start learning best practices as soon as possible. These are all just suggestions, but they're pretty good ones.

u/aabbccaabbcc · 2 pointsr/linguistics

The NLTK book is a good hands-on free introduction that doesn't require you to understand a whole lot of math.

Other than that, the "big two" textbooks are:

Speech and Language Processing by Jurafsky & Martin (there are some chapter drafts from the upcoming 3rd edition here!
Foundations of Statistical Natural Language Processing by Manning & Schutze

They're a lot more math-heavy than the NLTK book, but if you're interested in computational linguistics/NLP, then your math and statistics should be solid.

Honorable mention:
Linguistic Fundamentals for Natural Language Processing by Emily Bender aims to give an overview of stuff from pen-and-paper linguistics that are good to know in the computational linguistics world.

Hope this helps!

u/tpederse · 2 pointsr/LanguageTechnology

I always thought this was a pretty good introduction to UIMA.

http://www.morganclaypool.com/doi/abs/10.2200/S00194ED1V01Y200905HLT003

It presumes you know a bit about NLP already, and for that Jurafsky and Martin is a great place to start.

http://www.amazon.com/Speech-Language-Processing-2nd-Edition/dp/0131873210

There are some very nice video lectures from Chris Manning and Dan Jurafsky as well :

https://www.youtube.com/playlist?list=PLSdqH1MKcUS7_bdyrIl626KsJoVWmS7Fk

u/chorankates · 2 pointsr/perl

Perl::Critic is awesome, highly recommend pairing it with Perl Best Practices. i can never remember which came first, but this book tackles the pros and cons of many of the critiques very well.

u/AndrewKemendo · 2 pointsr/statistics

Darwiche - Modeling and Reasoning with Bayesian Statistics.

Excellent book which walks you through the foundations and mechanics of Bayesian inference and building bayes nets. It has everything you need to build your own models to include algorithms which are easily translatable into software code and has a great glossary and set of appendices.

u/bfoz · 2 pointsr/ProgrammingLanguages

Sorry, I forgot to answer your question. I'm bored with the day job so I've been playing around with parsers and recently read "Parsing Techniques". So I'm the proverbial newbie in the space. Does that count as a use case?

u/nihalnayak · 2 pointsr/LanguageTechnology

Not much of linguistics knowledge is required to get started with NLP. However, if you wish to learn about linguistics before starting with NLP, I'd recommend this book - Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax by Emily M. Bender.

&#x200B;

u/kolrehs · 2 pointsr/learnprogramming

there is this o'reilly book. It has lots of examples, which might be good for picking up python as you go, but I don't know for sure b/c i haven't read it.

u/tetramarek · 2 pointsr/compsci

I watched the entire course of Data Structures and Algorithms by Richard Buckland (UNSW) and thought it was excellent.
http://www.youtube.com/playlist?list=PLE621E25B3BF8B9D1

There is also an online course by Tim Roughgarden (Stanford) currently going on. It's very good but I don't know if you can still sign up.
https://class.coursera.org/algo

Topcoder.com is a fun place to test your skills in a competitive environment.

That being said, based on the description you are interested in things which don't usually fit into algorithms books or courses. Instead, you might want to look into machine learning and maybe even NLP. For example Pattern Recognition and Machine Learning by Bishop and Foundations of Statistical Natural Language Processing by Manning & Schuetze are great books for that.

u/ccc123ccc · 1 pointr/programming

I've seen it on page 5 of Perl Best Practices, but I don't know if they lifted it from somewhere else.

u/admiralwaffles · 1 pointr/bigdata

Glad it's useful! I'll copy some of a reply I gave to somebody who PM'd me about advice for a data science career, because it's pertinent to you:

You need to understand where you want to go -- more science-y or more business-y. See, science-y type of analytics are heavy on the stats, applying really advanced methods to glean some counterintiutive and/or non-obvious insight. Business-y type stuff is digging through the data to understand what it's telling you and to build a bit of a story to figure out what the business is doing, and then measuring success after something changes. Both have their value. Essentially: science side tells you about the data, but the business side tells you how to make decisions based on the data. You'll fall somewhere on that spectrum, so just play to your strengths.

Once you've determined this, you need to learn a few things:

Excel. Excel is the greatest data tool ever invented and I'll fight anybody who says different. Learn all about formulas, pivot tables, and whatnot. Excel is so deep, but really understand that Data tab. There is no better tool to connect to your data and just play around with it to figure out what you're looking at.
Python, specifically NumPy and Pandas. Those are the two modules that will let you play with data very quickly. Pandas puts data in tables and allows you to operate on them. NumPy handles very complex calculations very quickly. Also, learn about Jupyter notebooks -- they're wonderful when doing analysis.
Business! You may already know this, but you need to understand what the analytics are looking for. Data's only value is to use it to make better decisions. That's it. Data has no inherent value, and you need to understand how it can be leveraged. Even if you're more interested in the science of it, you still should have some grounding in how data is used.
Bonus: GIS stuff. QGIS is a good tool and it's free. You can also do a ton of GIS stuff in Python with Shapely and Matplotlib (honestly, like 85% of my GIS work is in Python). This is especially helpful if you have some really interesting things geographically. Just be cognizant that every geographic insight isn't useful.

As for some resources, here are some courses I think would be good from MIT CourseWare (full disclosure: I haven't sat through these specific courses, but these are the topics that are important):
Statistical Thinking and Data Analysis
Data, Models, and Decisions
Communicating with Data (Little dated, but still valuable)

You may also want to read up on machine learning. I like the O'Reilly book on it, but there are tons of books out there about it now.

Hope that helps!

u/onepraveen · 1 pointr/learnpython

Thanks for the suggestion. Is below book good for beginning....

https://www.amazon.com/Introduction-Machine-Learning-Python-Scientists/dp/1449369413#

u/bartleby485 · 1 pointr/audiophile

Human and Machine Hearing: Extracting Meaning from Sound https://www.amazon.com/dp/1107007534/ref=cm_sw_r_cp_api_i_zNY.AbG5EZJTC

u/my_work_account_shh · 1 pointr/speechprocessing

Which toolkit are you using for your HMMs? The HTK book has some general steps on what to do when it comes to HMM-base ASR. You might also want to have a look at the Speech Recognition chapter in Jurafsky and Martin's Speech and Language Processing, if you can find it online or in a library.

That being said, the state-of-the-art for ASR is mostly DNNs. HMMs are being phased out quite quickly as the main acoustic models in most speech applications. If you're interested in speech, why not start with those?

u/iseedoug · 1 pointr/TechProductReviews

If anyone is interested, this the book that really set the tone for the open source movement and the vast amount of trust put in open source software.

http://www.amazon.com/gp/product/B0026OR3LM?btkr=1

u/as4nt · 1 pointr/italy

Di editori ce ne sono diversi, se cerchi un'introduzione alla PNL con un approccio accademico, ti consiglio: Speech And Language Processing: An Introduction to Natural Language Processing , Computational Linguistics, and Speech Recognition.

Alternativamente, Natural Language Processing with Python .

u/wawawawa · 1 pointr/perl

I'd be really genuinely interested to see a source for your comment regarding "-pbp" being out of favour. Do you have a link? I always thought thedamian was incredibly well respected in the Perl community. His book Perl Best Pratices is a great read, in my opinion... Actually, his Object Oriented Perl is an amazing read, too.

EDIT: daft spelling error

u/Ac1dRa1n09 · 1 pointr/compling

Well, my advice would be to go to your uni's library and have a look for some books on language technology/computational linguistics. Although I come from the U.S and lived there most of my life, I now live and am doing my master's in Germany and so I unfortunately I don't know much about any English-language books to get you started with an introduction to the field, however, I have heard this book get thrown into a few conversations in classes:

http://www.amazon.com/Handbook-Computational-Linguistics-Language-Processing/dp/1118347188

I am also pretty sure your uni library will have something on it in the linguistics/compsci section. However, if this isn't the case, send me a PM and I will try to help you out. I have a ton of information that I could simply "translate" into English and give to you if you'd like to know where to get started with research for it and such.

u/shawncplus · 1 pointr/PHP

If you're interested in that kind of thing I recommend Parsing Techniques: A Practical Guide by Dick Grune http://www.amazon.com/Parsing-Techniques-Practical-Monographs-Computer/dp/1441919015

u/hobo_law · 1 pointr/LanguageTechnology

Ah, that makes sense. Yup, using any sort of large corpus like that to create a more general document space should help.

I don't know what the best way to visualize the data is. That's actually one of the big challenges with high dimensional vector spaces like this. Once you've got more than three bases you can't really draw it directly. One thing I have played around with is using D3.js to create a force directed graph where the distance between nodes corresponds to the distance between vectors. It wasn't super helpful though. However I just went to look at some D3.js examples and it looks like there's an example of an adjacency matrix here: https://bost.ocks.org/mike/miserables/ I've never used one, but it seems like it could be helpful.

The link seems to working now for me, but if it stops working again here's the book it was taken from: https://www.amazon.com/Speech-Language-Processing-Daniel-Jurafsky/dp/0131873210 googling the title should help you find some relevant PDFs.

u/contentBat · 1 pointr/iosgaming

> That's just what's left after apple

The point is not to end up with the same amount of money. It's to turn the money into legal money with a paper trail. That way you can pay your taxes and keep off the radar of the IRS.

It looks like another avenue that the IRS likely have not caught up to yet. I thought the norm was creating best selling books (see the "also bought" section).

If you want to learn more with an interesting story, just watch Ozarks.

u/Choosing_is_a_sin · 1 pointr/linguistics

Why not start with a book that deals with mathematical approaches to linguistics?

u/plaintxt · 1 pointr/learnpython

Maybe this will help...
https://www.amazon.com/Natural-Language-Processing-Python-Analyzing/dp/0596516495/ref=sr_1_1?ie=UTF8&amp;qid=1500483013&amp;sr=8-1&amp;keywords=python+nlp

u/skibo_ · 1 pointr/compsci

Well, I'm a bit late. But what /u/Liz_Me and /u/robthablob are saying is the same I was taught in NLP classes. DFA (Deterministic Finite Automatons) can be represented as regular expressions and vice versa. I guess you could tokenize without explicitly using either (e.g. split string at whitespace, although I suspect, and please correct me if I'm wrong, that this can also be represented as a DFA). The problem with this approach is that word boundaries don't always match whitespaces (e.g. periods or exclamation marks after last word of sentence). So I'd suggest, if you are working in NLP, that you become very familiar with regular expressions. Not only are they very powerful, but you'll also need to use them for other typical NLP tasks like chunking. Have a look at the chapter dedicated to the topic in Jurafsky and Martin's Speech and Language Processing (one of the standard NLP books) or Mastering Regular Expressions.

u/Megatron_McLargeHuge · 1 pointr/MachineLearning

There are a million details as others have said. You don't know how much you're missing.

This is the book to read for traditional HMM-based ASR.

Ignore the discussion of Baum-Welch. The HMM isn't trained in the normal ways since 1. it's huge, and 2. there's limited data. The transition probabilities come from your language model. The HMM topology is usually to have three states per phone-in-context, and to use a dictionary of pronunciation variants for each word.

Each state has a GMM to model the probabilities of the features. The features are MFCCs of a frame plus deltas and double deltas from the MFCCs of the previous frame. You'll probably use a diagonal covariance matrix.

Remember I said phone-in-context? That's because the actual pronunciation of a phoneme depends on the phonemes around it. You have to learn clusters of these since there are too many contexts to model separately.

Training data: to train, you need alignments of words and their pronunciations to audio frames. This pretty much requires using an existing recognizer to do labeling for you. You give it a restricted language model to force it to recognize what was said and use the resulting alignment as training data.

Extra considerations: how to model silence (voice activity detector), how to handle pauses and "ums" (voiced pauses). How to handle mapping non-verbatim transcripts to how they might have been spoken (how did he say 1024?). How to adapt to individual speakers. How to collapse states of the HMM into a lattice. How to handle backoff from long ngrams to short ones in your language model.

Needless to say, I don't recommend this for a master's thesis.

u/treerex · 1 pointr/csbooks

They're the most recent, and both are excellent. I used to have two copies of Manning and Schütze — one for home and one for work.

Winograd's Language As a Cognitive Process was the first NLP book I owned and I still refer to it once in a while.

Perera and Schieber's Prolog and Natural-Language Analysis is good if you're interested in logic programming and NLP. It's dense though.

Search on Amazon for "natural language processing" and you will find a bunch of books from Springer that were released in the last year or two.

u/fanglet · 1 pointr/linguistics

Hell yes. If you want a slightly less intense introduction to computational linguistics, I'd also recommend Natural Language Processing with Python.

u/Lamez · 1 pointr/linux

I noticed it was online, is there a place where I can get a tangible copy?

Is this the one?

u/artpendegrast · 1 pointr/compsci

This might have some material that will help you out.
Natural Language Processing with Python
It can also be found for free, too.

u/Franko_ricardo · 0 pointsr/learnprogramming

This is good : http://www.amazon.com/Where-Wizards-Stay-Up-Late/dp/0684832674

http://www.amazon.com/Cathedral-Bazaar-Musings-Accidental-Revolutionary-ebook/dp/B0026OR3LM/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1407387705&amp;sr=1-1&amp;keywords=the+cathedral+and+the+bazaar

Both offer a very entertaining view on relevant technology. I like the Cathedral and the Bazaar because its discussion is still quite relevant today.

u/mithaldu · -3 pointsr/programming

Other languages encourage and practice code reuse and sharing of best practices a lot more than PHP. Consider for example: Perl Best Practices or Python PEPs.

Best natural language processing books according to redditors

1. Speech and Language Processing, 2nd Edition

2. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit

3. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary

4. Introduction to Machine Learning with Python: A Guide for Data Scientists

5. Perl Best Practices

6. Foundations of Statistical Natural Language Processing

7. Encyclopedia of Language and Linguistics

8. Perl Best Practices: Standards and Styles for Developing Maintainable Code

9. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary

10. Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax (Synthesis Lectures on Human Language Technologies)

11. Parsing Techniques: A Practical Guide (Monographs in Computer Science)

12. Machine Ethics

13. Spoken Language Processing: A Guide to Theory, Algorithm and System Development

14. Logics and Falsifications: A New Perspective on Constructivist Semantics (Trends in Logic (40))

15. Mathematical Linguistics (Advanced Information and Knowledge Processing)

16. Understanding Japanese Information Processing

17. The Handbook of Computational Linguistics and Natural Language Processing

18. Human and Machine Hearing: Extracting Meaning from Sound

19. Prolog and Natural-Language Analysis (Center for the Study of Language and Information Publication Lecture Notes)

20. Modeling and Reasoning with Bayesian Networks

Top Reddit comments about Natural Language Processing: