We have launched the Professional Document Similarity API on Mashape
We have launched the Professional Document Similarity API on Mashape, which compares two English text documents for similarity. You can try our demo on the Document Similarity website: Document Similarity Demo.
The Document Similarity API is based on advanced Natural Language Processing and Machine Learning technologies. It belongs to the text analysis family and can be used to analyze the semantic similarity (cosine similarity) of two text documents that the user provides. You can test it here with the demo, or subscribe to the free plan of our Document Similarity API on Mashape to test it first. Because it runs on the Mashape platform, the Document Similarity API can easily be used in any environment capable of making HTTP requests, including Java/JVM/Android, Node.js, PHP, Python, Objective-C/iOS, Ruby, and .NET. If you have any questions or want customized text analysis services, contact us by email: textanalysisapi@gmail.com
Here is an example from Wikipedia: we can compare the “Deep Learning” and “Machine Learning” articles with our document similarity service.
The first text is from Deep Learning:
Deep learning (deep structural learning or hierarchical learning) is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations. Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation can be represented in many ways such as a vector of intensity values per pixel. Some representations make it easier to learn tasks from examples. Research in this area attempts to make better representations and create models to learn these representations. Various deep learning architectures such as deep neural networks, convolutional deep neural networks, and deep belief networks have been applied to fields like computer vision, automatic speech recognition, natural language processing, and audio recognition where they have been shown to produce state-of-the-art results on various tasks. Alternatively, deep learning has been characterized as a buzzword, or a rebranding of neural networks.
The second text is from Machine Learning:
Machine learning is a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model based on inputs and using that to make predictions or decisions, rather than following only explicitly programmed instructions. Machine learning can be considered a subfield of computer science and statistics. It has strong ties to artificial intelligence and optimization, which deliver methods, theory and application domains to the field. Machine learning is employed in a range of computing tasks where designing and programming explicit, rule-based algorithms is infeasible. Example applications include spam filtering, optical character recognition, search engines and computer vision. Machine learning is sometimes conflated with data mining, although that focuses more on exploratory data analysis. Machine learning and pattern recognition “can be viewed as two facets of the same field.”
On our Document Similarity Demo, the result is:
Similarity: 0.8088 (cosine similarity ranges from 0 to 1; 0 means completely different, 1 means identical)
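To give a feel for what a cosine similarity score measures, here is a minimal Python sketch. It is only an illustration under stated assumptions: it uses plain term-frequency (bag-of-words) vectors, while the model behind the Document Similarity API is not described in this post, so the sketch should not be expected to reproduce the 0.8088 score above. The function names `tokenize` and `cosine_similarity` are made up for this example and are not part of the API.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text_a, text_b):
    # Assumption: each document is a term-frequency vector (bag of words).
    # This is a stand-in, not the method used by the Document Similarity API.
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))

    # Only terms present in both documents contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))

    # An empty document has no direction, so report 0.0 instead of dividing by zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    score = cosine_similarity(
        "Deep learning is part of machine learning.",
        "Machine learning explores algorithms that learn from data.",
    )
    print("Similarity: %.4f" % score)
```

Because term counts are never negative, the score stays between 0 and 1, which matches the range given above. A purely word-based vector like this one scores two texts as similar only when they share words, so paraphrases with little vocabulary in common will score low.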
Enjoy it. If you have any questions or want customized text analysis services, contact us by email: textanalysisapi@gmail.com
The match between
“Paste or enter two piece of text below to check their similarity” and
“Paste the context of two documents to check their match”
is almost 0 (0.02) – so please, when will this insanity of looking at language as just data – as a bag of characters stop???