Building an RSS reader for Android from nothing: RSS feed retrieval and parsing

I’ve been meaning to write a series of posts about how I built NiceFeed: not a tutorial or a step-by-step guide, but more of a walkthrough of all the moving parts, how they came to be, and how they fit together. I present it neither as an example of a great app nor of great programming, and naturally there are dozens of things that could be improved, but on the whole I’m quite fond of it and use it every day to get my news on my Android phone. My hope is to clarify some of my thinking about building the app and solving the many problems I encountered, so that it might be applied to future problems and projects. And of course, I hope that anybody who happens to stumble upon these posts finds them instructive, or at least amusing.

RSS has been around for a long time and there are already many other readers and aggregators out there, but I’ve found many of them, particularly the free ones, awkward or difficult to navigate and jam-packed with features I don’t really need. My goal was an attractive and intuitive app, fully functional and with not too many frills.

*

An RSS feed appears in the wild as XML-formatted plain text, which needs to be parsed and converted to a form usable by an RSS aggregator application. Several third-party libraries for parsing RSS already exist, so I didn’t have to write my own parsing code. I chose the aptly named RSS Parser because it seemed to be regularly updated and well documented. In addition, the library not only parses raw XML data but also executes its own HTTP requests, making it very easy to get up and running.

To start, I had to decide what specific data I wanted the app to obtain. RSS Parser, after retrieving and parsing an RSS feed from the web, returns a Channel object (the feed itself), which contains a collection of Article objects (the contents of that feed). Both Channel and Article contain some properties we definitely want (title, author, URL, image, etc.) and some that we don’t need. So, I created my own data classes, Feed and Entry, to contain only those properties that I wanted. Between the two is a many-to-many relationship: a feed is associated with many entries, and a single entry can be associated with more than one feed. We’ll return to this particular detail another time.

Determining which properties to include and exclude involved a lot of hemming and hawing as I became more familiar with the RSS data I was getting. I also added a few of my own: in Feed, category, which defaults to “Uncategorized,” and unreadCount, which simply holds the number of entries associated with that feed that are not marked as read; and in Entry, the properties isStarred and isRead, both pretty self-explanatory and necessary if we want to keep track of whether entries have been read and/or starred (marked as favorite).

@Entity
data class Feed(
    @PrimaryKey val url: String, // Doubles as Feed ID
    var title: String,
    val website: String,
    val description: String? = null,
    val imageUrl: String? = null,
    var category: String = "Uncategorized",
    var unreadCount: Int
) : Serializable

@Entity
data class Entry(
    @PrimaryKey val url: String, // Doubles as Entry ID
    val title: String,
    val website: String,
    val author: String?,
    val date: Date?,
    val content: String?,
    val image: String?,
    var isStarred: Boolean = false,
    var isRead: Boolean = false
) : Serializable {

    ...
}
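
To make the relationship between the two classes concrete, here is a minimal plain-Kotlin sketch (no Room annotations) of a many-to-many link between feeds and entries, with the unread count derived from it. The names EntryState, FeedEntryLink, entriesFor, and unreadCountFor are hypothetical and exist only for illustration; the app itself stores unreadCount as a property on Feed.

```kotlin
// Simplified stand-in for Entry, keeping only what this sketch needs.
data class EntryState(val url: String, val isRead: Boolean = false)

// One row per (feed, entry) pair; the same entry URL may appear under several feeds.
data class FeedEntryLink(val feedUrl: String, val entryUrl: String)

// Looks up every entry linked to the given feed.
fun entriesFor(
    feedUrl: String,
    links: List<FeedEntryLink>,
    entries: Map<String, EntryState>
): List<EntryState> =
    links.filter { it.feedUrl == feedUrl }.mapNotNull { entries[it.entryUrl] }

// Counts the linked entries that are not yet marked as read.
fun unreadCountFor(
    feedUrl: String,
    links: List<FeedEntryLink>,
    entries: Map<String, EntryState>
): Int = entriesFor(feedUrl, links, entries).count { !it.isRead }
```

Because the link lives in its own list, one entry can belong to two feeds without being duplicated, and marking it as read lowers the unread count of both.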

Now for the real action. I created a class called FeedParser to contain all RSS Parser-related code. This class would contain all the needed methods for interacting with RSS Parser, and act as the authority across the entire app on retrieving and parsing RSS feeds from the internet. Naturally the class evolved over time as the app grew, but its role was always the same. Note: RSS Parser uses Kotlin Coroutines so the methods that use it have to be “suspend” functions.

/*  Responsible for retrieving and parsing RSS feeds */
class FeedParser(private val networkMonitor: NetworkMonitor) {

    private lateinit var rssParser: Parser
    // Nullable type parameter, because null is posted when a request fails
    private val _feedRequestLiveData = MutableLiveData<FeedWithEntries?>()
    val feedRequestLiveData: LiveData<FeedWithEntries?>
        get() = _feedRequestLiveData

    suspend fun getFeedSynchronously(url: String): FeedWithEntries? {
        rssParser = Parser.Builder().build()
        return if (networkMonitor.isOnline) {
            try {
                val channel = rssParser.getChannel(url)
                ChannelMapper.makeFeedWithEntries(url, channel)
            } catch (e: Exception) {
                null
            }
        } else null
    }

    suspend fun requestFeed(url: String, backup: String? = null) {
        rssParser = Parser.Builder().build()
        if (networkMonitor.isOnline) {
            BackupUrlManager.setBase(backup)
            executeRequest(url)
        } else _feedRequestLiveData.postValue(null)
    }

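    fun cancelRequest() {
        // rssParser is lateinit, so guard against cancelling before any request was made
        if (::rssParser.isInitialized) rssParser.cancel()
        BackupUrlManager.reset()
    }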

    private suspend fun executeRequest(url: String) {
        // Automatically makes several requests with different possible URLs
        Log.d(TAG, "Requesting $url")

        try {
            val channel = rssParser.getChannel(url)
            val feedWithEntries = ChannelMapper.makeFeedWithEntries(url, channel)
            _feedRequestLiveData.postValue(feedWithEntries)
        } catch (e: Exception) {
            // If the initial request fails, try backup URL in different variations
            BackupUrlManager.getNextUrl()?.let { executeRequest(it) }
                ?: let {
                    _feedRequestLiveData.postValue(null)
                    Log.d(TAG, "Request failed")
                }
        }
    }

    ...

    companion object {

        private const val TAG = "FeedParser"
        private const val UNTITLED = "Untitled"
        const val FLAG_EXCERPT = "com.joshuacerdenia.android.nicefeed.excerpt "
    }
}

FeedParser takes a NetworkMonitor as an injected dependency. We’ll take a look at that at a later time, but it’s very simple: just an object that monitors the device’s internet connectivity. It contains one public property: isOnline, which at any given time is either true or false. All web requests are first checked against this property before executing.
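
The gating pattern described here might look like the sketch below. It is not the app's NetworkMonitor, which is covered later; ConnectivitySource, FakeConnectivity, and fetchIfOnline are hypothetical names, and the request itself is passed in as a lambda so that the example needs no Android or networking code.

```kotlin
// Anything that can report whether the device is currently online.
interface ConnectivitySource {
    val isOnline: Boolean
}

// Test double whose state can be flipped by hand.
class FakeConnectivity(override var isOnline: Boolean) : ConnectivitySource

// Runs the request only when online; returns null when offline or when the request throws.
fun <T> fetchIfOnline(connectivity: ConnectivitySource, request: () -> T): T? {
    if (!connectivity.isOnline) return null
    return try {
        request()
    } catch (e: Exception) {
        null
    }
}
```

Injecting the connectivity check as a dependency, rather than querying the system directly, is what makes it possible to substitute a fake like this one in tests.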

There are three public methods. The first two are very similar, and each begins by creating a new instance of RSS Parser:

First, getFeedSynchronously, which is really only for retrieving a feed as a background task. It takes a String called url (the address of an RSS feed), uses it to make a web request, and returns a FeedWithEntries: an additional data class that combines one Feed with a list of associated Entries. If the device is offline or the request fails, it returns null instead. As it is meant to run in the background, the function waits for the web request to complete and hands the result directly back to its caller; hence, it is synchronous.

data class FeedWithEntries(
    @Embedded val feed: Feed,
    @Relation(
        parentColumn = "url",
        entityColumn = "url",
        associateBy = Junction(
            value = FeedEntryCrossRef::class,
            parentColumn = "feedUrl",
            entityColumn = "entryUrl"
        )
    )
    val entries: List<Entry>
)

Second, requestFeed, like the above, takes a String URL, and optionally a “backup” or second URL. Instead of returning anything, it passes the main URL to a private method, executeRequest, which does the actual requesting via the current instance of RSS Parser. An object called BackupUrlManager notes the backup URL, if any, and hangs on to it until needed. We’ll take a closer look at it at a future time, but for now, all we need to know is that it generates different variations of the backup URL that can be used to retrieve a particular RSS feed.

Regarding the private method executeRequest: if a request fails and BackupUrlManager still has an untried variation of the backup URL, the method calls itself recursively with that variation, until a request succeeds or all variations are exhausted. Upon a successful request, the result is posted asynchronously to the class-level property feedRequestLiveData, which can then be read by whichever part of the app initiated the request. If every attempt fails, or if the device is offline, null is posted instead.

The third public method is cancelRequest, which simply cancels the current instance of RSS Parser (and with it any pending request), and clears the BackupUrlManager.
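
The retry behaviour of executeRequest can also be written as a loop instead of recursion. The sketch below is an illustration under stated assumptions, not the app's code: candidateUrls and its "/feed" and "/rss" suffixes are invented stand-ins for whatever BackupUrlManager actually generates, and fetch is a placeholder for the call to RSS Parser.

```kotlin
// Hypothetical variations of a backup URL; the real BackupUrlManager may produce different ones.
fun candidateUrls(url: String, backup: String?): List<String> {
    val variations = backup?.trimEnd('/')?.let { listOf(it, "$it/feed", "$it/rss") }
    return (listOf(url) + variations.orEmpty()).distinct()
}

// Tries each candidate in order and returns the first successful result, or null if all fail.
fun <T : Any> requestWithFallback(
    url: String,
    backup: String?,
    fetch: (String) -> T
): T? {
    for (candidate in candidateUrls(url, backup)) {
        try {
            return fetch(candidate)
        } catch (e: Exception) {
            // This candidate failed; move on to the next one.
        }
    }
    return null
}
```

A loop keeps the list of candidates in one place and avoids shared state between calls, at the cost of computing all variations up front.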

You’ll notice also that both getFeedSynchronously and executeRequest refer to an object called ChannelMapper. As I said earlier, the RSS Parser object returns a Channel, which contains several properties as well as a collection of Articles. ChannelMapper is nested within FeedParser and contains methods for converting a Channel into a FeedWithEntries (again, a combination of one Feed and multiple Entry objects). This is just my way of organizing the code and keeping these methods in one place.

/*  Maps 'Channel' data into 'Feed' and 'Entry' objects */
private object ChannelMapper {

    private const val MAX_ENTRIES = 300 // Arbitrary
    private const val DATE_PATTERN = "EEE, d MMM yyyy HH:mm:ss Z"

    fun makeFeedWithEntries(url: String, channel: Channel): FeedWithEntries {
        val entries = mapEntries(channel, url)
        val feed = Feed(
            url = url, // The url that successfully completes the request is applied
            website = channel.link ?: url,
            title = channel.title ?: channel.link?.shortened() ?: url.shortened(),
            description = channel.description,
            imageUrl = channel.image?.url ?: channel.image?.link,
            unreadCount = entries.size
        )

        Log.d(TAG, "Retrieved ${entries.size} entries from $url")
        return FeedWithEntries(feed, entries)
    }

    private fun mapEntries(channel: Channel, url: String): List<Entry> {
        val entries = mutableListOf<Entry>()
        for (article in channel.articles) {
            if (entries.size < MAX_ENTRIES) {
                val entry = Entry(
                    url = article.link ?: article.guid ?: "",
                    website = channel.link ?: url,
                    title = article.title ?: UNTITLED,
                    author = article.author,
                    content = article.content ?: article.description.flagAsExcerpt(),
                    date = parseDate(article.pubDate),
                    image = article.image
                )
                entries.add(entry)
            } else break
        }
        return entries
    }

    private fun parseDate(stringDate: String?): Date? {
        // A malformed date should not fail the whole feed, so fall back to null
        return try {
            stringDate?.let { SimpleDateFormat(DATE_PATTERN, Locale.ENGLISH).parse(it) }
        } catch (e: ParseException) {
            null
        }
    }

    // Returns null for a null description instead of concatenating the text "null"
    private fun String?.flagAsExcerpt() = this?.let { FLAG_EXCERPT + it }
}

Here, the one public method, makeFeedWithEntries, accepts a String URL and Channel object and initiates the process of assigning all the data we need to properties that we specified earlier in the data classes Feed and Entry, and discarding the rest. The private method mapEntries, which makeFeedWithEntries calls within itself, does the same by looping through each Article contained in the Channel. In the end, we get a FeedWithEntries, ready to be stored or presented by the app.
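
As a worked example of the date handling, the sketch below parses an RFC 822 style pubDate with the same pattern as DATE_PATTERN. The names RSS_DATE_PATTERN and parseRssDate are hypothetical, and returning null for malformed input is an assumption of this sketch.

```kotlin
import java.text.ParseException
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale

// Same pattern as DATE_PATTERN above; matches e.g. "Mon, 06 Sep 2021 10:00:00 +0000".
const val RSS_DATE_PATTERN = "EEE, d MMM yyyy HH:mm:ss Z"

// Returns the parsed date, or null when the input is missing or malformed.
fun parseRssDate(raw: String?): Date? {
    if (raw == null) return null
    return try {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat(RSS_DATE_PATTERN, Locale.ENGLISH).parse(raw.trim())
    } catch (e: ParseException) {
        null
    }
}
```

Passing Locale.ENGLISH matters here: day and month names in RSS dates are English abbreviations, so parsing with the device's own locale could fail on phones set to another language.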

Side note: at the bottom of ChannelMapper is a method flagAsExcerpt which extends a nullable String. I use it to flag any Entry whose content property is null and the description is not empty—in the wild, it means the entry is probably an excerpt. Many RSS feeds nowadays, especially from subscription sources, do not syndicate full versions of their content, only short excerpts. I have yet to do anything with flagged Entries, but might in the future: for example, the app could be made to open any Entry flagged as an excerpt automatically with the device’s default browser, instead of within the app.
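
A null-safe variant of the flagging idea, together with the matching check and cleanup that a reader screen would need, might look like the following sketch. EXCERPT_FLAG, flaggedAsExcerpt, isExcerpt, withoutExcerptFlag, and chooseContent are hypothetical names, and the flag value is a placeholder rather than the app's constant.

```kotlin
// Placeholder marker; any string unlikely to appear at the start of real content works.
const val EXCERPT_FLAG = "example.excerpt "

// Prefixes non-blank text with the flag; null or blank text stays null.
fun String?.flaggedAsExcerpt(): String? =
    if (this.isNullOrBlank()) null else EXCERPT_FLAG + this

// True when the stored content carries the excerpt flag.
fun String?.isExcerpt(): Boolean = this?.startsWith(EXCERPT_FLAG) == true

// Strips the flag, if present, before the content is displayed.
fun String.withoutExcerptFlag(): String = removePrefix(EXCERPT_FLAG)

// Mirrors the mapping rule: prefer full content, otherwise fall back to a flagged description.
fun chooseContent(content: String?, description: String?): String? =
    content ?: description.flaggedAsExcerpt()
```

With helpers like these, the reader could check isExcerpt() when an entry is opened and decide whether to show it in the app or hand it to the browser.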

And a final remark: I’ve written all my code with modularity and flexibility in mind, to the extent that I’m able. RSS Parser serves all my current needs but is not a perfect library—notably, it does not support Atom, and in the future I might want to use a different one. Since all the relevant code is contained entirely within the class FeedParser, we could easily create a new class with which to replace it as the final authority on all things related to retrieving and parsing RSS, without affecting much else in the app.
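
One way to make that kind of swap explicit is to hide the parser behind a small interface. The following is only a sketch: FeedSource, ParsedFeed, StubFeedSource, and FeedRepository are hypothetical names that do not appear in the app, and in practice the method would likely be a suspend function returning a FeedWithEntries.

```kotlin
// Minimal stand-in for FeedWithEntries.
data class ParsedFeed(val url: String, val title: String, val entryTitles: List<String>)

// The only contract the rest of the app would depend on.
interface FeedSource {
    fun getFeed(url: String): ParsedFeed?
}

// A canned implementation; a class wrapping RSS Parser, or an Atom-capable library,
// could take its place without changing any caller.
class StubFeedSource(private val feeds: Map<String, ParsedFeed>) : FeedSource {
    override fun getFeed(url: String): ParsedFeed? = feeds[url]
}

// Callers receive a FeedSource and never name a concrete parsing library.
class FeedRepository(private val source: FeedSource) {
    fun titleOf(url: String): String = source.getFeed(url)?.title ?: "Untitled"
}
```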
