
Scraping data from website with infinite scrolling

R is a popular language for web scraping and has plenty of packages for scraping static websites. However, dynamically generated websites are growing in popularity. These are harder to scrape, because the content is generated only after the page loads or after we trigger some event on it, such as scrolling. Luckily, R has a solution for this too: the RSelenium package, which lets us connect to a Selenium server. You can learn more about this package from its vignette.

Creating singleton pattern in S4

In my previous post, we created a package with a connection to a database. However, this connection was exposed as a global object, which anyone could freely access. I didn’t like this approach, so I decided to look for a way to encapsulate the object and hide it from the user of the package. As someone who used to work with Java and Scala before transitioning to R, I immediately thought of the singleton pattern.

Scraping financial statements of Slovak financial entities

It is always interesting to go back to your older projects. You can see how your coding style has evolved and how you, as a programmer, have improved. Recently, I had to go through the code of one of my first projects in R and boy, was it a mess. It was supposed to download the financial statements of all the businesses in Slovakia for a given year. It worked, barely. But trying to understand the code was a pain.