Scraper site

A scraper site is a website that copies content from other websites using web scraping. The content is then mirrored with the goal of creating revenue, usually through advertising and sometimes by selling user data. Scraper sites come in various forms. Some provide little, if any material or information, and are intended to obtain user information such as e-mail addresses, to be targeted for spam e-mail. Price aggregation and shopping sites access multiple listings of a product and allow a user to rapidly compare the prices. A scraper site is a website that copies content from other websites using web scraping. The content is then mirrored with the goal of creating revenue, usually through advertising and sometimes by selling user data. Scraper sites come in various forms. Some provide little, if any material or information, and are intended to obtain user information such as e-mail addresses, to be targeted for spam e-mail. Price aggregation and shopping sites access multiple listings of a product and allow a user to rapidly compare the prices. Search engines such as Google could be considered a type of scraper site. Search engines gather content from other websites, save it in their own databases, index it and present the scraped content to their search engine's own users. The majority of content scraped by search engines is copyrighted. The scraping technique has been used on various dating websites as well and they often combine it with facial recognition. Scraping is also used on general image recognition websites, and websites specifically made to identify images of crops with pests and diseases Some scraper sites are created to make money by using advertising programs. In such case, they are called Made for AdSense sites or MFA. This derogatory term refers to websites that have no redeeming value except to lure visitors to the website for the sole purpose of clicking on advertisements. Made for AdSense sites are considered search engine spam that dilute the search results with less-than-satisfactory search results. The scraped content is redundant to that which would be shown by the search engine under normal circumstances, had no MFA website been found in the listings. Some scraper sites link to other sites to improve their search engine ranking through a private blog network. Prior to Google's update to its search algorithm known as Panda, a type of scraper site known as an auto blog was quite common among black hat marketers who used a method known as spamdexing. Scraper sites may violate copyright law. Even taking content from an open content site can be a copyright violation, if done in a way which does not respect the license. For instance, the GNU Free Documentation License (GFDL) and Creative Commons ShareAlike (CC-BY-SA) licenses used on Wikipedia require that a republisher of Wikipedia inform its readers of the conditions on these licenses, and give credit to the original author.

Parent Topic

Child Topic

No Parent Topic