1. Why search engines should value original content
1.1 Scraping is rampant
According to a survey from Shanghai love, more than 80% of news and information is reposted by hand or scraped by machine: from newspaper stories in traditional media to gossip on entertainment sites, from introductions to product reviews. Even the overdue-return notices issued by university libraries are scraped by some sites. High-quality original content is a grain of millet in an ocean of scraped copies, and picking that grain out of the sea is a difficult and challenging job for a search engine.
1.2 Improving the user experience
Digitization has lowered the cost of distribution, and tooling has lowered the cost of scraping. Machine scraping obscures where content came from and lowers its quality. During scraping, whether intentionally or not, pages are copied incompletely, their formatting is scrambled, or junk is appended to them. These problems appear endlessly and have seriously damaged the quality of search results and the user experience. The fundamental reason search engines value original content is to improve the user experience, and "original" here means high-quality original content.
1.3 Encouraging original authors and articles
Reposting and scraping divert traffic away from high-quality original sites, and the copies no longer carry the original author's name, which directly reduces the income of the original site owners and authors. In the long run this dampens creators' enthusiasm, which is bad for innovation and for the production of new high-quality content. Encouraging high-quality original work and innovation, and giving original sites and authors a reasonable share of traffic so that Internet content can flourish, is an important task for a search engine.
2. Scraping is cunning, and identifying original content is hard
2.1 Scraped content posing as original, with key information tampered
At present, a large number of websites scrape original content in bulk and then, manually or by machine, tamper with key information such as the author, publication time and source so that the copy passes as original. Search engines need to identify this kind of fake original and adjust for it appropriately.
2.2 Content generators that manufacture pseudo-original articles
With an automatic article generator, producing a "unique" article and giving it an eye-catching title now costs very little, and the result is certainly unique. But original content must have value that society broadly recognizes. Churning out an unreadable piece of garbage does not make it valuable, high-quality original content. Content that is unique but lacks this recognized value is pseudo-original, and it is what search engines most need to identify and penalize.
2.3 Structured information is difficult to extract
Site structures differ considerably from one site to another, and the meaning and placement of HTML tags differ as well, so the difficulty of extracting key information such as the title, author and time also varies widely. Extraction has to be complete and accurate, and also as timely as possible.
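To make the structured-extraction difficulty concrete, here is a minimal sketch in Python of pulling the title, author and time out of two pages that mark these fields up differently. It is only an illustration under stated assumptions, not how any search engine actually does it: the `RULES` table, the `FieldExtractor` class, the `extract` function, and both sample pages (including the author names) are hypothetical and invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical fallback rules: for each field, (tag, attribute, value)
# patterns tried in priority order. Real sites vary far more than this.
RULES = {
    "title": [("h1", None, None), ("title", None, None)],
    "author": [("span", "class", "author"), ("a", "rel", "author")],
    "time": [("time", None, None), ("span", "class", "date")],
}


class FieldExtractor(HTMLParser):
    """Collects the text of the first element that matches each rule."""

    def __init__(self):
        super().__init__()
        self.found = {}   # (field, rule index) -> collected text
        self._open = []   # [tag, key, depth] for elements being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Track nesting so an inner tag of the same name does not end a capture.
        for capture in self._open:
            if capture[0] == tag:
                capture[2] += 1
        for field, rules in RULES.items():
            for index, (rule_tag, rule_attr, rule_value) in enumerate(rules):
                if tag != rule_tag:
                    continue
                if rule_attr is not None:
                    values = (attrs.get(rule_attr) or "").split()
                    if rule_value not in values:
                        continue
                key = (field, index)
                if key not in self.found:  # keep only the first match
                    self.found[key] = ""
                    self._open.append([tag, key, 1])

    def handle_endtag(self, tag):
        for capture in self._open:
            if capture[0] == tag:
                capture[2] -= 1
        self._open = [c for c in self._open if c[2] > 0]

    def handle_data(self, data):
        for _, key, _ in self._open:
            self.found[key] += data


def extract(html):
    """Return title/author/time, using the first rule that yields text."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    result = {}
    for field, rules in RULES.items():
        result[field] = None
        for index in range(len(rules)):
            text = parser.found.get((field, index), "").strip()
            if text:
                result[field] = text
                break
    return result


# Two made-up pages that expose the same fields through different markup.
SITE_A = (
    "<html><head><title>Site A - News</title></head><body>"
    "<h1>Original report</h1><span class='author'>Li Lei</span>"
    "<time>2013-05-01</time></body></html>"
)
SITE_B = (
    "<html><head><title>A product review</title></head><body>"
    "<div class='meta'><a rel='author' href='/u/1'>Han Mei</a> "
    "<span class='date'>2013-05-02</span></div></body></html>"
)

print(extract(SITE_A))
print(extract(SITE_B))
```

Even in this toy case the two pages need different rules for every field, and a third site could break all of them, which is why complete and accurate extraction is hard at scale.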