Tech Today w/ Ken May

Archive for April 22nd, 2017

Internet Archive to ignore robots.txt directives

Posted by kenmay on April - 22 - 2017

Robots (or spiders, or crawlers) are little computer programs that search engines use to scan and index websites. Robots.txt is a little file placed on webservers to tell search engines what they should and shouldn’t index. The Internet Archive isn’t a search engine, but it has historically obeyed exclusion requests from robots.txt files. Now it’s changing its mind, because robots.txt is almost always crafted with search engines in mind and rarely reflects the intentions of domain owners when it comes to archiving. From the Internet Archive’s announcement:

Over time we have observed that the robots.txt files that are geared toward search engine crawlers do not necessarily serve our archival purposes. Internet Archive’s goal is to create complete “snapshots” of web pages, including the duplicate content and the large versions of files. We have also seen an upsurge of the use of robots.txt files to remove entire domains from search engines when they transition from a live web site into a parked domain, which has historically also removed the entire domain from view in the Wayback Machine. In other words, a site goes out of business and then the parked domain is “blocked” from search engines and no one can look at the history of that site in the Wayback Machine anymore. We receive inquiries and complaints on these “disappeared” sites almost daily.

A few months ago we stopped referring to robots.txt files on U.S. government and military web sites for both crawling and displaying web pages (though we respond to removal requests sent to info@archive.org). As we have moved towards broader access it has not caused problems, which we take as a good sign. We are now looking to do this more broadly.

An excellent decision. To be clear, they’re ignoring robots.txt even if you explicitly identify and disallow the Internet Archive. It’s a splendid reminder that nothing published on the web is ever meaningfully private, and that everything you publish goes on your permanent record.
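For anyone curious what an explicit “keep the Internet Archive out” rule looks like in practice, here is a minimal Python sketch using the standard library’s urllib.robotparser. It shows how a robots.txt-honoring crawler interprets a directive aimed at one specific bot; the “ia_archiver” user-agent string and the example.com URL are illustrative assumptions, and under the new policy the Archive would simply no longer consult such a rule.

# Minimal sketch of how a robots.txt-obeying crawler reads a per-bot disallow.
# Assumptions: "ia_archiver" as the Internet Archive's user agent and the
# example.com URL are illustrative only.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/some/page.html"
for agent in ("ia_archiver", "Googlebot"):
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent}: {verdict} for {url}")

# Output: ia_archiver is blocked, Googlebot is allowed. A crawler that obeys
# the file skips the page for ia_archiver, but under the Archive's new policy
# this rule no longer keeps the page out of the Wayback Machine.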

Categories: reader

Lilium’s electric VTOL jet completes its first test flight

Posted by kenmay on April - 22 - 2017

Today, Munich-based Lilium Aviation conducted the first test flight of its all-electric, two-seater, vertical take-off and landing (VTOL) prototype. “In a video provided by the Munich-based startup, the aircraft can be seen taking off vertically like a helicopter, and then accelerating into forward flight using wing-borne lift,” reports The Verge. From the report: The craft is powered by 36 separate jet engines mounted on its 10-meter-long wings via 12 movable flaps. At take-off, the flaps are pointed downwards to provide vertical lift. Once airborne, the flaps gradually tilt into a horizontal position, providing forward thrust. During the tests, the jet was piloted remotely, but its operators say their first manned flight is close at hand. Lilium claims that its electric battery “consumes around 90 percent less energy than drone-style aircraft,” enabling the aircraft to achieve a range of 300 kilometers (186 miles) with a maximum cruising speed of 300 kph (186 mph). “It’s the same battery that you can find in any Tesla,” Lilium co-founder Patrick Nathen told The Verge. “The concept is that we are lifting with our wings as soon as we progress into the air with velocity, which makes our airplane very efficient. Compared to other flights, we have extremely low power consumption.” The plan is to eventually build a five-passenger version of the jet.
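A quick back-of-the-envelope sketch in Python, just to sanity-check the converted figures above. The 300 km range and 300 kph cruising speed come from the report; the unit conversion and the implied one-hour endurance at top cruise speed are plain arithmetic, not Lilium’s own numbers.

# Sanity check on the quoted range and speed figures (arithmetic only).
KM_PER_MILE = 1.609344  # exact definition of the international mile

range_km = 300.0    # claimed range
cruise_kph = 300.0  # claimed maximum cruising speed

print(f"Range: {range_km / KM_PER_MILE:.0f} miles")       # ~186 miles
print(f"Max cruise: {cruise_kph / KM_PER_MILE:.0f} mph")  # ~186 mph
print(f"Implied endurance at max cruise: {range_km / cruise_kph:.1f} h")  # 1.0 h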

Categories: reader

107 cancer papers retracted due to peer review fraud

Posted by kenmay on April - 22 - 2017

Pictured: Probably an editor who peer-reviewed stuff for Tumor Biology. (credit: flickr user: 派脆客 Lee)

The journal Tumor Biology is retracting 107 research papers after discovering that the authors faked the peer review process. This isn’t the journal’s first rodeo. Late last year, 58 papers were retracted from seven different journals; 25 of those came from Tumor Biology for the same reason. It’s possible to fake peer review because authors are often asked to suggest potential reviewers for their own papers. This is done because research subjects are often blindingly niche; a researcher working in a sub-sub-field may be more aware than the journal editor of who is best placed to assess the work. But some journals go further and request, or allow, authors to submit the contact details of these potential reviewers. If the editor isn’t aware of the potential for a scam, they then merrily send the requests for review out to fake e-mail addresses, often using the names of actual researchers. And at the other end of the fake e-mail address is someone who’s in on the game and happy to send in a friendly review.

Categories: reader

Russian hacker Roman Seleznev sentenced to 27 years in prison

Posted by kenmay on April - 22 - 2017

Images of Seleznev with stacks of cash were found on his laptop following his 2014 arrest in the Maldives. (credit: Department of Justice)

Russian hacker Roman Seleznev was sentenced to 27 years in prison today. He was convicted of causing more than $169 million in damage by hacking into point-of-sale computers. Seleznev, aka “Track2,” would hack into computers belonging to both small businesses and large financial institutions, according to prosecutors. He was arrested in the Maldives in 2014 with a laptop that held more than 1.7 million credit card numbers. After an August 2016 trial, Seleznev was convicted on 38 counts, including wire fraud, intentional damage to a protected computer, and aggravated identity theft. The sentence is quite close to the 30 years the government asked for. Prosecutors said Seleznev deserved the harsh sentence because he was “a pioneer” who helped grow the market for stolen credit card data and because he “became one of the most revered point-of-sale hackers in the criminal underworld.”

Categories: reader