Monitor Changes in a Website and Push Notification to Slack via Huginn and Slack API


0x00 Story

A friend of mine wanted to monitor changes in a website. At first he used a monitoring plugin in Chrome, but before long he realized that he could not keep his computer running all the time, so using a VPS to do the monitoring seemed like a pretty decent choice.

That raises the question: what software should we use? It occurred to me that I could write a crawler in Python, but I'm too lazy to write one. So I wondered whether there was an easier way to do it. This is where Huginn and Slack come to my aid.

0x01 Huginn Website Agent Setup

Huginn Website Agent

First, we create a Website Agent to capture and store the information on the website. Setting the agent name, schedule, and so on is easy, so let's skip that part.

In the options, we need to specify the URL of the website we want to monitor and everything we want to extract from its source. The example options are structured as follows.

The extract section holds the keys and values we will extract from the website source. For instance, the example options have five keys: "status", "room_nr", "surface_area", "additional_cost", and "orientation". Each key's value in turn contains two more key-value pairs, "css" and "value".
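To make that structure concrete, here is a minimal sketch that builds options of this shape in Python and prints them as JSON, ready to paste into the agent's options box. Only the five key names come from the example above. The URL, the selectors, and the helper name build_website_agent_options are placeholders I made up, and the "mode": "on_change" setting (emit an event only when the extracted data changes) is my assumption about how you would want to run a change monitor.

```python
import json

# The five keys named in the article; every selector below is a made-up placeholder.
FIELDS = ["status", "room_nr", "surface_area", "additional_cost", "orientation"]


def build_website_agent_options(url, selectors):
    """Return a dict shaped like Huginn Website Agent options (hypothetical helper)."""
    extract = {
        # "css" selects the elements, "value" is an XPath expression applied to each one.
        key: {"css": selectors[key], "value": "string(.)"}
        for key in FIELDS
    }
    return {
        "expected_update_period_in_days": "2",
        "url": url,
        "type": "html",
        "mode": "on_change",  # assumption: only emit events when the data changes
        "extract": extract,
    }


# Placeholder selectors such as "#listing .status"; replace with real ones.
placeholder_selectors = {key: "#listing ." + key for key in FIELDS}
options = build_website_agent_options("https://example.com/listing", placeholder_selectors)
print(json.dumps(options, indent=2))
```

Swap in your real URL and selectors, then confirm the result with a dry run in Huginn before trusting it.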

The "css" key and its value tell Huginn which elements to select from the page. The value is a CSS selector (e.g. #mailContentContainer table tbody td a); to select with an XPath expression instead (e.g. //*[@id="element"]/h3/a), the Website Agent provides an "xpath" key that takes the place of "css". The "value" key and its value tell Huginn what to do with each element selected this way. In short, this pair lets Huginn extract the specific content we need rather than a whole HTML element (which is what the "css" part selects). The value can be an XPath function such as string or concat, and the "." in the example options refers to the element Huginn has just selected.
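The two-step idea (first select elements, then turn each one into a string) can be tried locally before touching Huginn. The sketch below is only an illustration using Python's standard library. It is not how Huginn itself does the extraction, ElementTree understands just a small subset of XPath, and the HTML fragment and the helper name extract are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed fragment standing in for the monitored page.
HTML = """
<table id="listing">
  <tr><td class="status">Available <b>now</b></td></tr>
  <tr><td class="status">Rented</td></tr>
</table>
"""


def extract(root, path):
    """Select elements, then reduce each one to its text content."""
    # Step 1 (the role of "css"): select a set of elements.
    nodes = root.findall(path)
    # Step 2 (the role of "value"): convert each element to a string,
    # similar in spirit to the XPath function string(.).
    return ["".join(node.itertext()).strip() for node in nodes]


root = ET.fromstring(HTML.strip())
print(extract(root, ".//td[@class='status']"))  # ['Available now', 'Rented']
```

Without the second step you would be left with whole elements, nested tags included, which is exactly what the "value" pair is there to avoid.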

After setting up your Website Agent, you can test whether it works by clicking the "Dry Run" button. If the event output is exactly what you want, save your agent and go drink a cup of coffee. If it does not meet your expectations, check your XPath expression or the function you used.

0x02 Slack Push Application Setup

Now that we are able to monitor changes in the target website, the next step is to push those changes to Slack.

To achieve this, we need to build an app using the Slack API: https://api.slack.com/apps?new_app=1

Create Slack App

The App Name is free to choose, but we must set the correct Development Slack Workspace to ensure that messages are delivered to the right place.

Slack App Permission Settings

Now we have to grant some permissions so that our app can push notifications to the workspace. Click "Permissions" on the app information page. Go to Scopes, add the permissions "Send messages as (Your App's Name)" and "Send messages as user", then press the "Save Changes" button. After doing this, go to OAuth Tokens & Redirect URLs and authorize the app.

Now refresh the page, and you should see an OAuth Access Token like xoxp-xxx...xxx-xxxxx...xxxx-xxxx...xxxxx-xxxx...xxxx

Copy it. It is essential, since it is the authorization token Huginn uses to push notifications.
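Before wiring the token into Huginn, it may be worth checking that Slack accepts it. Here is a small sketch that calls Slack's auth.test method with the token as a Bearer header. The environment variable name SLACK_TOKEN and the helper names are my own choices, and the network call only runs if that variable is set.

```python
import json
import os
import urllib.request

AUTH_TEST_URL = "https://slack.com/api/auth.test"


def build_auth_test_request(token):
    """Build (but do not send) a POST request carrying the token as a Bearer header."""
    return urllib.request.Request(
        AUTH_TEST_URL,
        data=b"",
        method="POST",
        headers={"Authorization": "Bearer " + token},
    )


def check_token(token):
    """Send the request and return Slack's decoded JSON reply."""
    with urllib.request.urlopen(build_auth_test_request(token), timeout=10) as resp:
        return json.load(resp)


# SLACK_TOKEN is a hypothetical variable name; nothing is sent unless it is set.
token = os.environ.get("SLACK_TOKEN")
if token:
    print(check_token(token))
```

A reply whose "ok" field is true means the token is valid; otherwise the reply carries an "error" field naming the problem.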

0x03 Huginn Post Agent Setup

OK, now let's go back to Huginn and create a Post Agent.

After setting the agent name and sources, we configure the agent options as described below.

Note that in the headers section we should replace xoxp-xxx...xxx-xxxxx...xxxx-xxxx...xxxxx-xxxx...xxxx with the authorization token we got in the last step.

If you want to @-mention someone, change the text field in the attachments section by replacing UserID with the desired one. The UserID can be found in the URL shown while sending a direct message to the user (e.g. https://edlinus.slack.com/messages/DBV26TRFS/ means DBV26TRFS).
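Since the original option listing is not reproduced here, the following is only a sketch of what Post Agent options along these lines might look like, again built in Python and printed as JSON. It assumes the agent posts to Slack's chat.postMessage endpoint with the token as a Bearer header. The channel, the message wording, the helper name, and the choice of {{status}} and {{room_nr}} (Liquid placeholders that Huginn fills in from the incoming event) are all placeholders of mine.

```python
import json


def build_post_agent_options(token, channel, user_id):
    """Return a dict shaped like Huginn Post Agent options (hypothetical helper)."""
    return {
        "post_url": "https://slack.com/api/chat.postMessage",
        "expected_receive_period_in_days": "1",
        "content_type": "json",
        "method": "post",
        # Replace the placeholder token with the one copied from Slack.
        "headers": {"Authorization": "Bearer " + token},
        "payload": {
            "channel": channel,
            "text": "Change detected on the monitored page",
            "attachments": [
                {
                    # <@UserID> is Slack's mention syntax; {{...}} is filled in by Huginn.
                    "text": "<@" + user_id + "> status: {{status}}, room: {{room_nr}}",
                }
            ],
        },
        "emit_events": "false",
    }


# All three arguments are placeholders to be replaced with real values.
post_options = build_post_agent_options("xoxp-REPLACE-ME", "#general", "UserID")
print(json.dumps(post_options, indent=2))
```

Keeping the mention inside the attachment text matches the description above, so changing who gets notified means editing only that one field.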

OK, now click the "Dry Run" button, select an output from the Website Agent we created earlier, press OK, and check the output that appears.

0x04 Conclusion

At any rate, Huginn is a handy tool if we are unable to write a crawler of our own, and it can meet the basic needs of a crawler and more. But if we want to go further, learning Python tools such as Scrapy or Selenium will help us build a more flexible and robust crawler.

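Statement: Edward Linus | All rights reserved, violators will be held liable | Original content unless otherwise noted | This website is licensed under BY-NC-SA

Reposting: when reposting, please cite the original link - Monitor Changes in a Website and Push Notification to Slack via Huginn and Slack API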

