How to Read Website in SwiftUI | Ege Sucu Blog

Web scraping in SwiftUI made simple

We’re living in an age in which using APIs is common. We, as mobile developers, are used to encoding and decoding JSON data to run our apps against a server. Sadly, not all websites or services provide an API. Sometimes, you need to read the website itself to get what you want. We call this data scraping. It consists of acquiring data from the page by filtering it with CSS selectors. How can we get that data into our SwiftUI app?

For this purpose, we need a tool that can parse and read the HTML body properly. I chose SwiftSoup, a 100% Swift library, for the job. You can add it as a CocoaPods pod or as a Swift Package, which is my preference.

As for today’s example, I want to create a simple blog reader app that parses swiftbysundell.com’s articles. Let’s inspect its HTML structure by right-clicking the page and viewing its source.

Here is what we are dealing with: all articles are listed inside <ul class="item-list">, and each of them has an <article> body. Inside an article, we have an <h1> as the title, a date inside <span class="date">, and the article’s URL in <a href="">. We need to scrape this data to create our article model.

First, I create a simple struct that will hold our data. I’ll make it Identifiable and Hashable so that it works properly with SwiftUI’s List and ForEach views.
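A minimal sketch of such a model follows. The property names are taken from the initializer call that appears later in this post; the `id` property and its `UUID` default are my assumption, added only to satisfy Identifiable.

```swift
import Foundation

// Hypothetical model: title, url and publishDate mirror the initializer used
// later in the post; the UUID-based id is an assumption.
struct Article: Identifiable, Hashable {
    let id = UUID()        // stable identity for List and ForEach
    let title: String
    let url: URL
    let publishDate: Date
}
```

Because every stored property is itself Hashable, Swift synthesizes the Hashable conformance, so no extra code is needed.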

Next, I’ll create the view. For this, I’ll make a searchable list that opens the article’s page in Safari when I tap a cell.

First, I create a results array that holds our articles, filtered by the search term. The search looks up not only the article’s title but also its date.
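The filtering idea can be sketched as a plain function. The function name `results(in:searchTerm:)`, the small `Article` stand-in, and the "dd MMM yyyy" display format used for matching dates are assumptions for illustration, not the project’s actual code.

```swift
import Foundation

// Stand-in for the article model described above (assumed shape).
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Returns every article whose title or formatted date contains the search term.
// An empty term returns the full list.
func results(in articles: [Article], searchTerm: String) -> [Article] {
    let term = searchTerm.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !term.isEmpty else { return articles }

    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.dateFormat = "dd MMM yyyy"   // assumed display format

    return articles.filter { article in
        article.title.localizedCaseInsensitiveContains(term)
            || formatter.string(from: article.publishDate)
                .localizedCaseInsensitiveContains(term)
    }
}
```

In a view, this would typically back a computed property that reads the `@State` search text.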

After this, I’ll create a section structure that shows a header and a footer. There will be two sections: one for today’s posts and one for previous posts.

Next, I’ll create a simple fetch function that fetches data from our dataModel.

ArticleCell is a simple sub-view in which I’ll show the data.
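Putting the view pieces together, a rough sketch could look like the one below. It assumes iOS 15 or later; `ArticleListView`, the section titles, the footer text, and the title-only search are simplifications of mine, and only the `ArticleCell` name comes from this post.

```swift
import SwiftUI

// Assumed model shape, matching the struct described earlier in the post.
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Sub-view that shows a single article's title and date.
struct ArticleCell: View {
    let article: Article

    var body: some View {
        VStack(alignment: .leading, spacing: 4) {
            Text(article.title).font(.headline)
            Text(article.publishDate, style: .date)
                .font(.caption)
                .foregroundColor(.secondary)
        }
    }
}

// Hypothetical list view: searchable, split into today's and previous posts.
struct ArticleListView: View {
    let articles: [Article]
    @State private var searchTerm = ""
    @Environment(\.openURL) private var openURL

    // Title-only search, to keep the sketch short.
    private var results: [Article] {
        searchTerm.isEmpty
            ? articles
            : articles.filter { $0.title.localizedCaseInsensitiveContains(searchTerm) }
    }

    var body: some View {
        NavigationView {
            List {
                section("Today", results.filter { Calendar.current.isDateInToday($0.publishDate) })
                section("Previous", results.filter { !Calendar.current.isDateInToday($0.publishDate) })
            }
            .searchable(text: $searchTerm)
            .navigationTitle("Articles")
        }
    }

    // One section with a header and a footer; tapping a row opens the article URL.
    private func section(_ title: String, _ items: [Article]) -> some View {
        Section {
            ForEach(items) { article in
                Button { openURL(article.url) } label: { ArticleCell(article: article) }
            }
        } header: {
            Text(title)
        } footer: {
            Text("\(items.count) article(s)")
        }
    }
}
```

Using the `openURL` environment action hands the link to the system, which opens it in the default browser.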

I also wrote some Date extensions that help me separate today’s posts from older ones and format the date the way I want it to look.
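The extensions themselves are not reproduced in this post, so the following is only a guess at their shape. The names `isToday` and `displayString(format:)` are hypothetical.

```swift
import Foundation

extension Date {
    // True when the date falls on the current calendar day.
    var isToday: Bool {
        Calendar.current.isDateInToday(self)
    }

    // Formats the date with a fixed pattern; "dd MMM yyyy" is an assumed default.
    func displayString(format: String = "dd MMM yyyy") -> String {
        let formatter = DateFormatter()
        formatter.locale = Locale(identifier: "en_US_POSIX")
        formatter.dateFormat = format
        return formatter.string(from: self)
    }
}
```

With helpers like these, today’s section can be built with `articles.filter { $0.publishDate.isToday }`.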

Now we need to implement our data service. I’ll create an ObservableObject with a Published variable named articleList and a baseURL that I’ll use.
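As a rough outline, the service could start like this. The class name `DataService` and the exact `baseURL` value are assumptions; only `articleList` and `baseURL` are named in this post.

```swift
import Foundation
import Combine

// Assumed model shape, matching the struct described earlier in the post.
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Hypothetical service class; views observing it are refreshed whenever
// articleList changes.
final class DataService: ObservableObject {
    @Published var articleList: [Article] = []

    // Assumed value: the site named in the post.
    let baseURL = URL(string: "https://swiftbysundell.com")!
}
```

A view would usually hold this object with `@StateObject` so that changes to `articleList` trigger a redraw.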

I’ll write a fetchArticles function that will

  • erase the array if it’s filled,
  • get the whole website as a string,
  • and parse the string as HTML with the help of SwiftSoup.

We’ll do the second and third steps inside a do-catch block, since these operations can throw an error.
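The second and third steps can be sketched as follows. The helper name `loadDocument(from:)` is hypothetical, and reading the page with `String(contentsOf:encoding:)` is one simple option that blocks the calling thread, so a real app may prefer an asynchronous request.

```swift
import Foundation
import SwiftSoup

// Downloads the page as a string and parses it into a SwiftSoup Document.
// Returns nil when either step throws.
func loadDocument(from url: URL) -> Document? {
    do {
        // Step 2: get the whole page as a string.
        let html = try String(contentsOf: url, encoding: .utf8)
        // Step 3: parse the string as HTML.
        return try SwiftSoup.parse(html)
    } catch {
        print("Could not load or parse \(url): \(error)")
        return nil
    }
}
```

Both calls throw, which is why a single do-catch block around them is enough.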

let articles = try doc.getElementsByClass("item-list").select("article")

This takes us to the list of articles on the website. Note that the doc.getElementsByClass call returns a collection, so we’ll use a for loop to handle each item individually.

let title = try article.select("a").first()?.text(trimAndNormaliseWhitespace: true) ?? ""

This selects the <a> tag and gets the text inside it, with any extra whitespace trimmed.
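To try the selectors without hitting the network, here is a small sketch that runs them against an inline HTML string. The markup is a hypothetical sample modelled on the structure described above, and `scrapeTitles(from:)` is an illustrative helper name.

```swift
import SwiftSoup

// Hypothetical sample markup; the real page may differ.
let sampleHTML = """
<ul class="item-list">
  <li><article>
    <h1><a href="/articles/first"> First post </a></h1>
    <div><span class="date">Published on 05 Mar 2022</span></div>
  </article></li>
  <li><article>
    <h1><a href="/articles/second"> Second post </a></h1>
    <div><span class="date">Published on 10 Jan 2021</span></div>
  </article></li>
</ul>
"""

// Collects the link text of every <article> inside the "item-list" element.
func scrapeTitles(from html: String) -> [String] {
    do {
        let doc = try SwiftSoup.parse(html)
        let articles = try doc.getElementsByClass("item-list").select("article")
        var titles: [String] = []
        for article in articles {
            let title = try article.select("a").first()?
                .text(trimAndNormaliseWhitespace: true) ?? ""
            titles.append(title)
        }
        return titles
    } catch {
        print("Parsing failed: \(error)")
        return []
    }
}
```

Testing against a fixed string like this also makes it easier to notice when the site’s markup changes and the selectors stop matching.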

let url = try baseURL.appendingPathComponent(article.select("a").attr("href"))

This reads the link’s href attribute and appends it to baseURL, which gives us the article URL we need.
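The line above uses appendingPathComponent. As an alternative sketch, not the approach used in this post, a root-relative href can also be resolved against the base with `URL(string:relativeTo:)`. The helper name `articleURL(href:baseURL:)` and the sample values are hypothetical.

```swift
import Foundation

// Resolves an href such as "/articles/first" against the site's base URL.
// Returns nil when the href is not a valid URL string.
func articleURL(href: String, baseURL: URL) -> URL? {
    URL(string: href, relativeTo: baseURL)?.absoluteURL
}
```

This follows the standard relative-reference rules, so an href that is already absolute is returned unchanged.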

let dateString = try article.select("div").select("span").text().replacingOccurrences(of: "Published on ", with: "").replacingOccurrences(of: "Remastered on ", with: "").replacingOccurrences(of: "Answered on ", with: "").trimmingCharacters(in: .whitespacesAndNewlines)

This long line fetches the date as a String. Since it may include some label text, we need to strip that away and trim any invisible whitespace.
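The same cleanup can be pulled into a small helper, which keeps the label list in one place. The function name `cleanDateString(_:)` is hypothetical, and the three prefixes are assumed to be the labels the site puts in front of its dates.

```swift
import Foundation

// Removes the known label prefixes and trims surrounding whitespace,
// leaving only the date portion of the text.
func cleanDateString(_ raw: String) -> String {
    let prefixes = ["Published on ", "Remastered on ", "Answered on "]
    var result = raw
    for prefix in prefixes {
        result = result.replacingOccurrences(of: prefix, with: "")
    }
    return result.trimmingCharacters(in: .whitespacesAndNewlines)
}
```

If the site adds a new label, only the `prefixes` array needs to change.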

I also want to convert this String to a Date, so I apply a DateFormatter.

let formatter = DateFormatter(dateFormat: "dd MMM yyyy")
let date = Calendar.current.startOfDay(for: formatter.date(from: dateString) ?? Date.now)
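Foundation’s DateFormatter has no built-in `init(dateFormat:)`, so the call above presumably relies on a convenience initializer defined elsewhere in the project. A sketch of what such an extension might look like is below. Pinning the locale to en_US_POSIX is my addition, so that English month abbreviations parse regardless of the device language.

```swift
import Foundation

// Assumed helper: a convenience initializer that sets the format in one step.
extension DateFormatter {
    convenience init(dateFormat: String) {
        self.init()
        self.locale = Locale(identifier: "en_US_POSIX")  // fixed locale for parsing
        self.dateFormat = dateFormat
    }
}

// Parses a scraped date string, falling back to today when parsing fails.
func parseArticleDate(_ dateString: String) -> Date {
    let formatter = DateFormatter(dateFormat: "dd MMM yyyy")
    return Calendar.current.startOfDay(for: formatter.date(from: dateString) ?? Date())
}
```

Falling back to the current date mirrors the line above, though it means an unparseable date shows up under today’s posts.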

Finally, I’ll create my article and append it to the model.

let post = Article(title: title, url: url, publishDate: date)
self.articleList.append(post)

The service class will look like this.

That’s it. Our app is working fine. You can check out the repo for the whole project.

The Result
