DOM Elements

Two main classes, described below, are used to locate an element in the DOM of your site.

class centaurminer.Element(method, selector)

Simple struct to store instructions to find an element on a page.

method

The name of the method used to locate the element. Common values include “id”, “xpath”, and “css_selector”; see https://selenium-python.readthedocs.io/locating-elements.html for more information. This is passed directly into the selenium.webdriver.common.By constructor.

Type

str

selector

Combined with method, defines a way to find a specific element on the DOM. This is the actual xpath, css selector, etc., used to find the element.

Type

str

needsInstructions

If true, this element requires instructions in the centaurminer.MiningEngine class before it can be extracted. Used for the Complex subclass.

Type

bool

get_attribute(attributeName)

Indicate which attribute of the element should be extracted.

Parameters

attributeName (str) – Name of the attribute to be extracted.

Returns

The element, so that the call can be chained with the constructor.

Return type

Element
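
The attributes and the chaining behaviour of get_attribute described above can be illustrated with a small stand-in class. This is only a sketch: ElementSketch and its attribute_name field are hypothetical names that mirror the documented interface, and the selector is made up for illustration. It is not centaurminer's actual implementation.

```python
class ElementSketch:
    """Hypothetical stand-in mirroring the documented Element interface."""

    def __init__(self, method, selector):
        self.method = method            # e.g. "id", "xpath", "css_selector"
        self.selector = selector        # the actual xpath, CSS selector, etc.
        self.needsInstructions = False  # True only for Complex-style elements
        self.attribute_name = None      # assumed field name; None = no attribute chosen

    def get_attribute(self, attributeName):
        # Record which attribute to extract, then return self so the call
        # can be chained directly onto the constructor.
        self.attribute_name = attributeName
        return self


# Chained with the constructor, as the "Returns" note describes.
link = ElementSketch("css_selector", "a.download").get_attribute("href")
print(link.method, link.selector, link.attribute_name)
```

Returning the element from get_attribute lets a locator and the attribute to read from it be declared in a single expression.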

class centaurminer.MetaData(name)

A special type of Element that’s derived from the page’s metadata.

Parameters

name (str) – The ‘name’ or ‘property’ value of the Metadata this object points to.
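
One plausible way to turn a metadata name into a locator is sketched below. It assumes that the metadata lives in meta tags, that a tag is matched by its name or property attribute via XPath, and that the value of interest is in the tag's content attribute. All three are assumptions made for illustration, and MetaDataSketch is a hypothetical name, not the library's class.

```python
class MetaDataSketch:
    """Hypothetical stand-in for an element located through a <meta> tag."""

    def __init__(self, name):
        self.method = "xpath"
        # Assumed selector: match a <meta> tag by its 'name' or 'property'.
        self.selector = "//meta[@name='{0}' or @property='{0}']".format(name)
        # Assumed: the value to extract is the tag's 'content' attribute.
        self.attribute_name = "content"


title = MetaDataSketch("og:title")
print(title.method, title.selector, title.attribute_name)
```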

An additional class is used to indicate that an element requires heavy post-processing in the centaurminer.MiningEngine subclass.

class centaurminer.Complex

This represents an element that needs further directions to extract.

An error is raised during data gathering if the centaurminer.MiningEngine doing the mining has no instruction for a Complex element.
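
The relationship between a Complex element and its engine can be sketched as follows. Everything here is hypothetical: ComplexSketch, EngineSketch, the get_<field> naming convention for instructions, and the choice of NotImplementedError are assumptions chosen for illustration, not the behaviour of centaurminer.MiningEngine.

```python
class ComplexSketch:
    """Hypothetical stand-in for an element that needs engine instructions."""
    needsInstructions = True


class EngineSketch:
    """Hypothetical engine that looks up a 'get_<field>' method per complex field."""

    def __init__(self, fields):
        self.fields = fields  # mapping of field name -> element description

    def gather(self):
        results = {}
        for name, element in self.fields.items():
            if getattr(element, "needsInstructions", False):
                handler = getattr(self, "get_" + name, None)
                if handler is None:
                    # No instruction was supplied for this complex element.
                    raise NotImplementedError(
                        "no instruction for complex element '%s'" % name
                    )
                results[name] = handler()
        return results


class AuthorsEngine(EngineSketch):
    def get_authors(self):
        # The heavy post-processing would live here; a placeholder is returned.
        return "placeholder value"


fields = {"authors": ComplexSketch()}
print(AuthorsEngine(fields).gather())

try:
    EngineSketch(fields).gather()  # this engine has no get_authors instruction
except NotImplementedError as err:
    print("error:", err)
```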