Olfeo OEM documentation

Running the SDK

To use the SDK, you need an instance of Database and an instance of Categorizer:

  • Categorizer is used to categorize a domain

  • Database is used to retrieve metadata about URL categories or application categories

The best way to understand how the SDK works is to look at the example code in ./src/main.go

Important

If you want to do categorization or detection using IP addresses instead of URLs, you will need to use Database.SetIpRangeMatcher to improve performance.

Main functions

src/main.go

This file contains example code to illustrate how to use the SDK to categorize URLs.

You can run the binary directly via Docker using ./scripts/client.sh, or manually using the Go toolchain.

NB: if you run the binary directly, at least two arguments are required: the Redis DSN, passed with -rdb-dsn, and at least one domain to categorize.

This code imports the package gitlab.olfeo.tech/data-tools/nexus/sdk/redis as rdb from the SDK. It then creates the structure redisDatabase, which implements the interface Database. This Database is passed to the function categorize.NewCategorizer, which returns the Categorizer used to categorize the domains.

It returns two pieces of information:

  • the category of the domain, by calling Categorizer.GetDomainCategory and Database.GetCategoryInfo

  • the associated application (if it exists) and its category, by calling Categorizer.GetDomainApplications, Database.GetApplicationInfo and Database.GetApplicationCategoryInfo

sdk-sample$ cd src
src$ go run main.go -rdb-dsn redis://localhost:6379 dropbox.com
Using rdb categorizer from redis://localhost:6379:
dropbox.com: (15181) Online Data Storage
dropbox.com: application (27) Dropbox
application category: (10005) Online Software providing cloud back-ups and data storage infrastructure
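
The two-step lookup behind the first output line (a category id from Categorizer, then its metadata from Database) can be sketched as follows. This is a minimal illustration, not the code of src/main.go: the interfaces are trimmed-down local copies, `CategoryInfo.Name` is an assumed field, and `fakeBackend`, `describeDomain` and `describeSample` are hypothetical helpers backed by an in-memory map instead of Redis.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// CategoryInfo is a hypothetical stand-in; the real SDK type may expose other fields.
type CategoryInfo struct {
	Name string
}

// Categorizer and Database are local copies limited to the two calls used here.
type Categorizer interface {
	GetDomainCategory(ctx context.Context, domain string, urlPath string) (uint32, error)
}

type Database interface {
	GetCategoryInfo(ctx context.Context, categoryId uint32) (*CategoryInfo, error)
}

// fakeBackend is an in-memory stand-in that plays both roles for the demo.
type fakeBackend struct {
	domains    map[string]uint32
	categories map[uint32]*CategoryInfo
}

func (f *fakeBackend) GetDomainCategory(_ context.Context, domain string, _ string) (uint32, error) {
	id, ok := f.domains[domain]
	if !ok {
		return 0, errors.New("unknown domain")
	}
	return id, nil
}

func (f *fakeBackend) GetCategoryInfo(_ context.Context, categoryId uint32) (*CategoryInfo, error) {
	info, ok := f.categories[categoryId]
	if !ok {
		return nil, errors.New("unknown category")
	}
	return info, nil
}

// describeDomain resolves the category id first, then its metadata,
// and formats both like the first line of the sample output.
func describeDomain(ctx context.Context, c Categorizer, db Database, domain string) (string, error) {
	id, err := c.GetDomainCategory(ctx, domain, "")
	if err != nil {
		return "", fmt.Errorf("categorize %s: %w", domain, err)
	}
	info, err := db.GetCategoryInfo(ctx, id)
	if err != nil {
		return "", fmt.Errorf("category info %d: %w", id, err)
	}
	return fmt.Sprintf("%s: (%d) %s", domain, id, info.Name), nil
}

// describeSample runs describeDomain against a small hard-coded data set.
func describeSample(domain string) (string, error) {
	backend := &fakeBackend{
		domains:    map[string]uint32{"dropbox.com": 15181},
		categories: map[uint32]*CategoryInfo{15181: {Name: "Online Data Storage"}},
	}
	return describeDomain(context.Background(), backend, backend, domain)
}

func main() {
	line, err := describeSample("dropbox.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

With the real SDK, the same shape would apply with the Categorizer returned by categorize.NewCategorizer and the Redis-backed Database in place of the fake.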

sdk/categorize

To create an instance of Categorizer, call the function NewCategorizer from ./src/sdk/categorize/handler.go with an instance of Database.

// Categorizer is the main interface that provides the categorization service
type Categorizer interface {
    // GetDomainCategory returns the correct category id of the given domain or url
    GetDomainCategory(ctx context.Context, domain string, urlPath string) (uint32, error)
    // GetDomainApplications returns the application ids of the given domain.
    // When the domain is an IP, there can be multiple applications.
    GetDomainApplications(ctx context.Context, domain string) ([]uint32, error)
    // GetAdvancedDomainInfo performs the same algorithm as GetDomainInfo but stores intermediate results
    GetAdvancedDomainInfo(ctx context.Context, domain string, urlPath string) (*AdvancedDomainInfo, error)
}
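
GetDomainCategory and GetAdvancedDomainInfo take the domain and the URL path as separate arguments. When the input is a full URL, one possible way to obtain that pair is sketched below with the standard net/url package. `splitURL` is a hypothetical helper, and treating urlPath as the URL's path component (without the query string) is an assumption of this sketch, not something the SDK documentation states.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitURL separates a raw URL or bare domain into a (domain, urlPath) pair.
// Assumption: urlPath is the path component, without the query string.
func splitURL(raw string) (domain string, urlPath string, err error) {
	// url.Parse only recognises the host when a scheme is present.
	if !strings.Contains(raw, "://") {
		raw = "http://" + raw
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", fmt.Errorf("parse %q: %w", raw, err)
	}
	host := u.Hostname() // drops any port
	if host == "" {
		return "", "", fmt.Errorf("no host in %q", raw)
	}
	return strings.ToLower(host), u.Path, nil
}

func main() {
	for _, raw := range []string{"dropbox.com", "https://www.dropbox.com/login?x=1"} {
		domain, urlPath, err := splitURL(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("domain=%q urlPath=%q\n", domain, urlPath)
	}
}
```

The resulting values could then be passed as the `domain` and `urlPath` arguments of GetDomainCategory.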

sdk/database

To create an instance of Database, call the function NewDatabase from ./src/sdk/database/redis/connect.go with a context.Context and a redis.UniversalClient.

type Database interface {
    // GetApplicationInfo returns metadata about the given applicationId
    //
    // Querying for a non-existent applicationId returns a sdk.NotInDatabase error
    GetApplicationInfo(ctx context.Context, applicationId uint32) (*ApplicationInfo, error)
    // GetDomainApplicationInfo returns application info about the given domain
    //
    // Querying for an unknown domain returns a sdk.NotInDatabase error
    GetDomainApplicationInfo(ctx context.Context, domain string) (*DomainApplicationInfo, error)
    // GetApplicationCategoryInfo returns metadata about the given applicationCategoryId
    //
    // Querying for a non-existent applicationCategoryId returns a sdk.NotInDatabase error
    GetApplicationCategoryInfo(ctx context.Context, applicationCategoryId uint32) (*ApplicationCategoryInfo, error)
    // GetCategoryInfo returns metadata about the given categoryId
    //
    // Querying for a non-existent categoryId returns a sdk.NotInDatabase error
    GetCategoryInfo(ctx context.Context, categoryId uint32) (*CategoryInfo, error)
    // GetThemeInfo returns metadata about the given themeId
    //
    // Querying for a non-existent themeId returns a sdk.NotInDatabase error
    GetThemeInfo(ctx context.Context, themeId uint32) (*ThemeInfo, error)
    // GetLogoData returns the byte sequence for the logo (as a 64x64 pixel PNG image)
    //
    // Querying for a non-existent logoId returns a sdk.NotInDatabase error
    GetLogoData(ctx context.Context, logoId uint32) ([]byte, error)
    // GetDomainInfo returns the domain info associated with a given domain
    //
    // Querying for an unknown domain returns a sdk.NotInDatabase error
    GetDomainInfo(ctx context.Context, domain string) (*DomainInfo, error)
    // GetCategoryInfoList returns a map of info on all categories in the database
    GetCategoryInfoList(ctx context.Context) (map[uint32]*CategoryInfo, error)
    // GetThemeInfoList returns a map of info on all themes in the database
    GetThemeInfoList(ctx context.Context) (map[uint32]*ThemeInfo, error)
    // GetThemeCategoryIds returns a list of category ids for a given theme
    //
    // Querying for an unknown theme returns an empty list
    GetThemeCategoryIds(ctx context.Context, themeId uint32) ([]uint32, error)
    // GetCategoryIds returns a list of all the categories in the database
    GetCategoryIds(ctx context.Context) ([]uint32, error)
    // GetThemeIds returns a list of all the themes in the database
    GetThemeIds(ctx context.Context) ([]uint32, error)
    // GetIpApplicationIds returns a list of the application ids associated with the given IP
    GetIpApplicationIds(ctx context.Context, ip string) ([]uint32, error)
    // GetApplicationCategoryIds returns a list of all the application categories in the database
    GetApplicationCategoryIds(ctx context.Context) ([]uint32, error)
    // GetCategoryApplicationIds returns a list of all the applications in the given application category in the database
    GetCategoryApplicationIds(ctx context.Context, applicationCategoryId uint32) ([]uint32, error)
    // HealthCheck returns an error if the database is not working correctly.
    //
    // The returned error will wrap the actual backend error
    HealthCheck(ctx context.Context) error
}
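
Several of these methods report a missing entry with a sdk.NotInDatabase error, which callers usually want to tell apart from a backend failure. The sketch below shows one way to do so. It assumes the SDK error can be matched with errors.Is, which this page does not confirm; `ErrNotInDatabase` is a local stand-in for sdk.NotInDatabase, and `memoryStore`, `categoryName`, `lookup` and the `CategoryInfo.Name` field are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrNotInDatabase is a local stand-in for sdk.NotInDatabase (assumed to be a sentinel error).
var ErrNotInDatabase = errors.New("not in database")

// CategoryInfo is a hypothetical stand-in for the SDK type.
type CategoryInfo struct{ Name string }

// categoryStore is the single Database method this sketch needs.
type categoryStore interface {
	GetCategoryInfo(ctx context.Context, categoryId uint32) (*CategoryInfo, error)
}

// memoryStore is an in-memory implementation used only for the demo.
type memoryStore map[uint32]*CategoryInfo

func (m memoryStore) GetCategoryInfo(_ context.Context, categoryId uint32) (*CategoryInfo, error) {
	info, ok := m[categoryId]
	if !ok {
		return nil, fmt.Errorf("category %d: %w", categoryId, ErrNotInDatabase)
	}
	return info, nil
}

// categoryName returns the category name and whether the id is known.
// A missing id is not treated as a failure; any other error is propagated.
func categoryName(ctx context.Context, db categoryStore, id uint32) (string, bool, error) {
	info, err := db.GetCategoryInfo(ctx, id)
	switch {
	case errors.Is(err, ErrNotInDatabase):
		return "", false, nil
	case err != nil:
		return "", false, err
	}
	return info.Name, true, nil
}

// lookup runs categoryName against a small hard-coded data set.
func lookup(id uint32) (string, bool, error) {
	store := memoryStore{15181: {Name: "Online Data Storage"}}
	return categoryName(context.Background(), store, id)
}

func main() {
	for _, id := range []uint32{15181, 42} {
		name, found, err := lookup(id)
		fmt.Printf("id=%d name=%q found=%v err=%v\n", id, name, found, err)
	}
}
```

If the SDK exposes NotInDatabase as an error type rather than a sentinel value, errors.As would be the matching call instead.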

Detect applications by their IP

The SDK can detect applications by IP, but the default behavior may perform poorly: the default matching strategy is not specifically targeted at this use case and looks for strict equality in the database rather than matching by range.

The method Database.SetIpRangeMatcher can be used to modify this behavior.

AutoUpdateMatcher, as the name suggests, will update the IP ranges automatically if there are any changes in the database.

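ctx := context.Background()
db, err := rdb.NewDatabase(ctx, redisClient)
if err != nil {
    panic(fmt.Errorf("initialize db rdb: %w", err))
}
// Build a matcher that refreshes its IP ranges when the database changes.
autoUpdateMatcher, err := matcher.NewAutoUpdateMatcher(ctx, db)
if err != nil {
    panic(fmt.Errorf("initialize iprange autoUpdateMatcher: %w", err))
}
// Replace the default strict-equality lookup with range matching.
db.SetIpRangeMatcher(autoUpdateMatcher)
// Then use `Database.GetIpApplicationIds` as usual.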

Note: If the IP ranges do not need to follow database changes as they happen, automatic updates can be disabled by passing the option WithoutAutoUpdate to matcher.NewAutoUpdateMatcher.
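
To illustrate the difference between strict equality and range matching described in this section, here is a small standalone sketch built on the standard net/netip package. It is not how the SDK's matcher is implemented. The ranges, the application ids, and the helpers `exactLookup`, `rangeLookup` and `sampleRanges` are all made up for the example, and the linear scan is chosen for readability rather than speed.

```go
package main

import (
	"fmt"
	"net/netip"
)

// rangeEntry associates an IP range with a hypothetical application id.
type rangeEntry struct {
	prefix netip.Prefix
	appID  uint32
}

// exactLookup only finds addresses that are stored verbatim as keys.
func exactLookup(table map[string][]uint32, ip string) []uint32 {
	return table[ip]
}

// rangeLookup returns the ids of every range that contains ip.
func rangeLookup(ranges []rangeEntry, ip string) ([]uint32, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return nil, fmt.Errorf("parse ip %q: %w", ip, err)
	}
	var ids []uint32
	for _, r := range ranges {
		if r.prefix.Contains(addr) {
			ids = append(ids, r.appID)
		}
	}
	return ids, nil
}

// sampleRanges returns made-up data inside the 192.0.2.0/24 documentation block.
func sampleRanges() []rangeEntry {
	return []rangeEntry{
		{prefix: netip.MustParsePrefix("192.0.2.0/24"), appID: 27},
		{prefix: netip.MustParsePrefix("192.0.2.128/25"), appID: 31},
	}
}

func main() {
	// Strict equality: only 192.0.2.1 is stored, so 192.0.2.200 is not found.
	exact := map[string][]uint32{"192.0.2.1": {27}}
	fmt.Println("exact:", exactLookup(exact, "192.0.2.200"))

	// Range matching: 192.0.2.200 falls inside both sample ranges.
	ids, err := rangeLookup(sampleRanges(), "192.0.2.200")
	if err != nil {
		panic(err)
	}
	fmt.Println("range:", ids)
}
```

An address can belong to several overlapping ranges, which is consistent with GetIpApplicationIds returning a list of ids rather than a single one.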