Using Solr to Search and Analyze Logs
Presented by Radu Gheorghe, Software Engineer, Sematext Group, Inc.
Many of us tend to hate or simply ignore logs, and rightfully so: they're typically hard to find, difficult to handle, and cryptic to the human eye. But can we make logs more valuable and more usable if we index them in Solr, so we can search and run real-time statistics on them? Indeed we can, and in this session you'll learn how to make that happen. In the first part of the session we'll explain why centralized logging is important and what valuable information one can extract from logs, and we'll introduce the leading tools of the logging ecosystem that everyone should be aware of, from syslog and log4j to Logstash and Flume. In the second part we'll teach you how to use these tools in tandem with Solr. We'll show how to use Solr in a SolrCloud setup to index large volumes of logs continuously and efficiently. Then we'll look at how to scale the Solr cluster as your data volume grows. Finally, we'll see how you can parse your unstructured logs and convert them to nicely structured Solr documents suitable for analytical queries.
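To make the last point of the abstract concrete, here is a minimal sketch of turning one unstructured log line into a structured document. It is not taken from the talk. It assumes a BSD-syslog-style line, assumes timestamps are in UTC, and uses hypothetical field names with suffixes such as `_s`, `_i`, `_dt`, and `_t`, which only work if the target Solr schema defines matching dynamic fields. The function name `parse_syslog_line` is likewise made up for illustration.

```python
import json
import re
from datetime import datetime

# Assumed input shape (BSD-syslog style), for example:
#   <34>Oct 11 22:14:15 mymachine su[123]: 'su root' failed for lonvick
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<program>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def parse_syslog_line(line, year):
    """Convert one raw log line into a dict shaped like a Solr document.

    The year is passed in because this timestamp format does not carry one.
    Field names are hypothetical and rely on dynamic fields in the schema.
    """
    match = SYSLOG_RE.match(line)
    if match is None:
        # Keep lines we cannot parse, so nothing is silently dropped.
        return {"message_t": line}

    # The priority value encodes facility and severity: pri = facility * 8 + severity.
    pri = int(match.group("pri"))
    parsed = datetime.strptime(
        f"{year} {match.group('timestamp')}", "%Y %b %d %H:%M:%S"
    )

    doc = {
        # Assumption: the source host logs in UTC.
        "timestamp_dt": parsed.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "facility_i": pri // 8,
        "severity_i": pri % 8,
        "host_s": match.group("host"),
        "program_s": match.group("program"),
        "message_t": match.group("message"),
    }
    if match.group("pid") is not None:
        doc["pid_i"] = int(match.group("pid"))
    return doc


if __name__ == "__main__":
    raw = "<34>Oct 11 22:14:15 mymachine su[123]: 'su root' failed for lonvick"
    # A JSON array of documents like this could be sent to a collection's
    # update handler; the sending step is left out of this sketch.
    print(json.dumps([parse_syslog_line(raw, 2013)], indent=2))
```

Once severity, host, and timestamp are separate typed fields rather than substrings of one message, they can be filtered and faceted on directly, which is what makes the analytical queries mentioned above practical.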