meta_oxide 0.1.1

Universal metadata extraction library supporting 13 formats (HTML Meta, Open Graph, Twitter Cards, JSON-LD, Microdata, Microformats, RDFa, Dublin Core, Web App Manifest, oEmbed, rel-links, Images, SEO) with 7 language bindings
# MetaOxide Java API Reference

Complete API documentation for the Java library.

## Table of Contents

- [Installation](#installation)
- [Package Structure](#package-structure)
- [Classes](#classes)
- [Methods](#methods)
- [Data Types](#data-types)
- [Exceptions](#exceptions)
- [Examples](#examples)

## Installation

### Maven

```xml
<dependency>
    <groupId>com.metaoxide</groupId>
    <artifactId>meta-oxide</artifactId>
    <version>0.1.0</version>
</dependency>
```

### Gradle

```gradle
dependencies {
    implementation 'com.metaoxide:meta-oxide:0.1.0'
}
```

## Package Structure

```java
package com.metaoxide;

import com.metaoxide.MetaOxide;
import com.metaoxide.Metadata;
import com.metaoxide.OpenGraphData;
import com.metaoxide.TwitterCardData;
import com.metaoxide.MetaOxideException;
```

## Classes

### `MetaOxide`

Main class for extracting metadata from HTML. Implements `AutoCloseable`.

#### Constructor

```java
public MetaOxide(String html, String baseUrl) throws MetaOxideException
```

**Parameters:**
- `html` (String) - HTML content to parse
- `baseUrl` (String) - Base URL for resolving relative URLs

**Throws:**
- `MetaOxideException` - If HTML parsing fails

**Example:**
```java
try (MetaOxide extractor = new MetaOxide(html, "https://example.com")) {
    // Use extractor
}
```

**Important:** Use try-with-resources to ensure proper resource cleanup.

## Methods

### `extractAll()`

Extract all metadata formats.

```java
public Metadata extractAll() throws MetaOxideException
```

**Returns:** `Metadata` - All extracted metadata

**Throws:** `MetaOxideException` - If extraction fails

**Example:**
```java
try (MetaOxide extractor = new MetaOxide(html, url)) {
    Metadata metadata = extractor.extractAll();
    System.out.println("Title: " + metadata.get("title"));
}
```

### `extractBasicMeta()`

Extract basic HTML metadata.

```java
public Metadata extractBasicMeta() throws MetaOxideException
```

**Returns:** `Metadata` - Basic HTML metadata

**Example:**
```java
Metadata basic = extractor.extractBasicMeta();
System.out.println("Title: " + basic.get("title"));
System.out.println("Description: " + basic.get("description"));
```

### `extractOpenGraph()`

Extract Open Graph metadata.

```java
public OpenGraphData extractOpenGraph() throws MetaOxideException
```

**Returns:** `OpenGraphData` or `null` if not present

**Example:**
```java
OpenGraphData og = extractor.extractOpenGraph();
if (og != null) {
    System.out.println("OG Title: " + og.getTitle());
    System.out.println("OG Image: " + og.getImage());
}
```

### `extractTwitterCard()`

Extract Twitter Card metadata.

```java
public TwitterCardData extractTwitterCard() throws MetaOxideException
```

**Returns:** `TwitterCardData` or `null` if not present

**Example:**
```java
TwitterCardData twitter = extractor.extractTwitterCard();
if (twitter != null) {
    System.out.println("Card Type: " + twitter.getCard());
}
```
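
Because `extractOpenGraph()` and `extractTwitterCard()` may each return `null`, callers often want a single "best available" value, such as a title. The sketch below shows one way to express that fallback. `FirstNonBlank` is a hypothetical helper written for this illustration; it is not part of MetaOxide.

```java
import java.util.Optional;

// Hypothetical helper (not part of MetaOxide): returns the first candidate
// that is neither null nor blank, trimmed of surrounding whitespace.
final class FirstNonBlank {
    private FirstNonBlank() {}

    static Optional<String> of(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return Optional.of(candidate.trim());
            }
        }
        // Every candidate was null or blank
        return Optional.empty();
    }
}
```

A caller could pass `og != null ? og.getTitle() : null`, then the Twitter Card title, then the basic `title` value, in order of preference.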

### `extractJSONLD()`

Extract JSON-LD structured data.

```java
public List<JSONLDData> extractJSONLD() throws MetaOxideException
```

**Returns:** `List<JSONLDData>` - List of JSON-LD objects

**Example:**
```java
List<JSONLDData> jsonld = extractor.extractJSONLD();
for (JSONLDData item : jsonld) {
    System.out.println("Type: " + item.getType());
}
```

### `extractMicrodata()`

Extract Microdata items.

```java
public List<MicrodataItem> extractMicrodata() throws MetaOxideException
```

**Returns:** `List<MicrodataItem>` - List of Microdata items

**Example:**
```java
List<MicrodataItem> microdata = extractor.extractMicrodata();
for (MicrodataItem item : microdata) {
    System.out.println("Type: " + item.getType());
}
```

### `extractMicroformats()`

Extract Microformats data.

```java
public Map<String, List<Metadata>> extractMicroformats() throws MetaOxideException
```

**Returns:** `Map<String, List<Metadata>>` - Microformats grouped by type

**Example:**
```java
Map<String, List<Metadata>> mf = extractor.extractMicroformats();
if (mf.containsKey("h-card")) {
    for (Metadata card : mf.get("h-card")) {
        System.out.println("Name: " + card.get("name"));
    }
}
```

### `extractDublinCore()`

Extract Dublin Core metadata.

```java
public DublinCoreData extractDublinCore() throws MetaOxideException
```

**Returns:** `DublinCoreData` or `null` if not present

### `extractRelLinks()`

Extract link relations.

```java
public Map<String, List<Link>> extractRelLinks() throws MetaOxideException
```

**Returns:** `Map<String, List<Link>>` - Links grouped by rel type

**Example:**
```java
Map<String, List<Link>> links = extractor.extractRelLinks();
if (links.containsKey("canonical")) {
    System.out.println("Canonical: " + links.get("canonical").get(0).getHref());
}
```
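
The example above assumes the list stored under a key is non-empty. Whether MetaOxide can return an empty list for a key is not stated here, so a defensive lookup may be worthwhile. The following is a minimal sketch; `Grouped.first` is a hypothetical helper that works with any `Map<String, List<T>>`, including the maps returned by `extractRelLinks()` and `extractMicroformats()`.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of MetaOxide): safely reads the first entry
// of a grouped result without risking NullPointerException or
// IndexOutOfBoundsException.
final class Grouped {
    private Grouped() {}

    static <T> Optional<T> first(Map<String, List<T>> groups, String key) {
        List<T> values = groups.get(key);
        if (values == null || values.isEmpty()) {
            return Optional.empty();
        }
        // ofNullable also covers a list whose first element is null
        return Optional.ofNullable(values.get(0));
    }
}
```

With it, the canonical lookup could read `Grouped.first(links, "canonical").map(Link::getHref)`.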

### `close()`

Close and release native resources. Automatically called by try-with-resources.

```java
public void close()
```
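
To illustrate what releasing resources through `AutoCloseable` typically involves, here is a minimal sketch of a class that guards a handle. `NativeHandleSketch` is a stand-in written for this illustration and does not reflect MetaOxide's actual implementation; it only shows the common pattern of an idempotent `close()` and a use-after-close check.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for a class that owns a native handle. NOT MetaOxide's real code.
final class NativeHandleSketch implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private int releaseCount = 0;

    String read() {
        // Refuse to touch the handle once it has been released
        if (closed.get()) {
            throw new IllegalStateException("already closed");
        }
        return "data";
    }

    @Override
    public void close() {
        // compareAndSet makes close() idempotent: the release runs at most once
        if (closed.compareAndSet(false, true)) {
            releaseCount++; // a real binding would free native memory here
        }
    }

    int releaseCount() {
        return releaseCount;
    }
}
```

With try-with-resources, `close()` runs when the block exits, whether normally or through an exception, which is why that form is recommended throughout this reference.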

## Data Types

### `Metadata`

Generic metadata container extending `HashMap<String, Object>`.

```java
public class Metadata extends HashMap<String, Object> {
    // Convenience methods
    public String getString(String key);
    public Integer getInt(String key);
    public List<String> getStringList(String key);
}
```
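
The bodies of the convenience methods are not shown above. As a rough illustration of how typed getters over a `HashMap<String, Object>` can behave, here is a sketch under stated assumptions: `MetadataSketch` is a hypothetical stand-in, and the conversion rules (returning `null` for missing or unparseable values, wrapping a single value in a one-element list) are choices made for this example, not documented MetaOxide behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical stand-in for Metadata; conversion rules are assumptions.
class MetadataSketch extends HashMap<String, Object> {
    private static final long serialVersionUID = 1L;

    public String getString(String key) {
        Object value = get(key);
        return value == null ? null : value.toString();
    }

    public Integer getInt(String key) {
        Object value = get(key);
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            try {
                return Integer.valueOf(((String) value).trim());
            } catch (NumberFormatException e) {
                return null; // present, but not an integer
            }
        }
        return null; // missing or of an unsupported type
    }

    public List<String> getStringList(String key) {
        Object value = get(key);
        List<String> result = new ArrayList<>();
        if (value instanceof Iterable) {
            for (Object item : (Iterable<?>) value) {
                if (item != null) {
                    result.add(item.toString());
                }
            }
        } else if (value != null) {
            result.add(value.toString()); // single value becomes a one-element list
        }
        return result;
    }
}
```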

### `OpenGraphData`

Open Graph metadata.

```java
public class OpenGraphData {
    private String title;
    private String type;
    private String image;
    private String url;
    private String description;
    private String siteName;
    private String locale;

    // Getters
    public String getTitle() { return title; }
    public String getType() { return type; }
    public String getImage() { return image; }
    public String getUrl() { return url; }
    public String getDescription() { return description; }
    public String getSiteName() { return siteName; }
    public String getLocale() { return locale; }
}
```

### `TwitterCardData`

Twitter Card metadata.

```java
public class TwitterCardData {
    private String card;
    private String site;
    private String creator;
    private String title;
    private String description;
    private String image;
    private String imageAlt;

    // Getters
    public String getCard() { return card; }
    public String getSite() { return site; }
    public String getCreator() { return creator; }
    public String getTitle() { return title; }
    public String getDescription() { return description; }
    public String getImage() { return image; }
    public String getImageAlt() { return imageAlt; }
}
```

### `MicrodataItem`

Microdata item.

```java
public class MicrodataItem {
    private List<String> type;
    private Map<String, List<Object>> properties;
    private String id;

    // Getters
    public List<String> getType() { return type; }
    public Map<String, List<Object>> getProperties() { return properties; }
    public String getId() { return id; }
}
```

### `Link`

Link element.

```java
public class Link {
    private String href;
    private String rel;
    private String media;
    private String title;
    private String type;
    private String hreflang;

    // Getters
    public String getHref() { return href; }
    public String getRel() { return rel; }
    // ... other getters
}
```

## Exceptions

### `MetaOxideException`

Base exception for MetaOxide errors.

```java
public class MetaOxideException extends Exception {
    public MetaOxideException(String message) {
        super(message);
    }

    public MetaOxideException(String message, Throwable cause) {
        super(message, cause);
    }
}
```

### Exception Handling

```java
try (MetaOxide extractor = new MetaOxide(html, url)) {
    Metadata metadata = extractor.extractAll();
    // Use metadata
} catch (MetaOxideException e) {
    System.err.println("Extraction failed: " + e.getMessage());
} catch (Exception e) {
    System.err.println("Unexpected error: " + e.getMessage());
}
```

## Examples

### Basic Usage

```java
import com.metaoxide.MetaOxide;
import com.metaoxide.Metadata;

public class Example {
    public static void main(String[] args) {
        String html = "<!DOCTYPE html>...";

        try (MetaOxide extractor = new MetaOxide(html, "https://example.com")) {
            Metadata metadata = extractor.extractAll();
            System.out.println("Title: " + metadata.get("title"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```

### Extract from URL

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import com.metaoxide.MetaOxide;
import com.metaoxide.Metadata;

public class URLExtractor {
    // One shared client so connections can be reused across requests
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static Metadata extractFromURL(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .build();

        // Download the page body as a string
        HttpResponse<String> response = CLIENT.send(request,
                HttpResponse.BodyHandlers.ofString());

        // Parse the body, using the request URL as the base for relative links
        try (MetaOxide extractor = new MetaOxide(response.body(), url)) {
            return extractor.extractAll();
        }
    }
}
```

### Parallel Processing

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

import com.metaoxide.Metadata;

public class ParallelExtractor {
    public static List<Metadata> extractMultiple(List<String> urls) {
        return urls.parallelStream()
                .map(url -> {
                    try {
                        // Calls the URLExtractor class from the "Extract from URL" example
                        return URLExtractor.extractFromURL(url);
                    } catch (Exception e) {
                        System.err.println("Failed: " + url);
                        return null;
                    }
                })
                // Drop URLs whose extraction failed
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }
}
```

### Spring Boot Service

```java
import org.springframework.stereotype.Service;
import com.metaoxide.MetaOxide;
import com.metaoxide.MetaOxideException;
import com.metaoxide.Metadata;

@Service
public class MetadataService {
    public Metadata extract(String html, String url) throws MetaOxideException {
        try (MetaOxide extractor = new MetaOxide(html, url)) {
            return extractor.extractAll();
        }
    }
}
```

### Android Usage

```java
import android.os.AsyncTask;
import com.metaoxide.MetaOxide;
import com.metaoxide.Metadata;

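// AsyncTask is deprecated since API level 30; on newer targets, run the same
// body on an Executor or in a coroutine.
public class MetadataTask extends AsyncTask<String, Void, Metadata> {
    @Override
    protected Metadata doInBackground(String... params) {
        String url = params[0];
        try {
            // fetchHTML is a placeholder for your own download code; it is not part of MetaOxide
            String html = fetchHTML(url);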
            try (MetaOxide extractor = new MetaOxide(html, url)) {
                return extractor.extractAll();
            }
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    @Override
    protected void onPostExecute(Metadata metadata) {
        if (metadata != null) {
            // Update UI
        }
    }
}
```

### Kotlin Usage

```kotlin
import com.metaoxide.MetaOxide
import com.metaoxide.Metadata

fun extractMetadata(html: String, url: String): Metadata? {
    return try {
        MetaOxide(html, url).use { extractor ->
            extractor.extractAll()
        }
    } catch (e: Exception) {
        null
    }
}

// Usage
val metadata = extractMetadata(html, "https://example.com")
metadata?.let {
    println("Title: ${it["title"]}")
}
```

## Thread Safety

Separate `MetaOxide` instances can be used concurrently from different threads. Create one instance per thread or task rather than sharing a single instance.

```java
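// Safe: each task creates and closes its own extractor.
// `urls` and `html` are assumed to be defined by the surrounding code.
ExecutorService executor = Executors.newFixedThreadPool(4);
urls.forEach(url -> {
    executor.submit(() -> {
        try (MetaOxide extractor = new MetaOxide(html, url)) {
            // Use the extractor on this thread only
        } catch (MetaOxideException e) {
            // A Runnable cannot propagate the constructor's checked exception
            System.err.println("Failed: " + url);
        }
    });
});
executor.shutdown(); // stop accepting new tasks; submitted tasks still run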
```
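
When results need to be collected, the same one-instance-per-task pattern can be wrapped in a small helper. The sketch below is an illustration only: `PerTaskExtraction` and its `extractOne` parameter are hypothetical names, and `extractOne` stands in for code that opens its own extractor, for example a lambda containing `try (MetaOxide x = new MetaOxide(html, url)) { return x.extractAll(); }`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper (not part of MetaOxide): runs one task per input on a
// fixed-size pool and returns the successful results in submission order.
final class PerTaskExtraction {
    private PerTaskExtraction() {}

    static <R> List<R> runAll(List<String> inputs, Function<String, R> extractOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (String input : inputs) {
                // Each task calls extractOne itself, so nothing is shared between threads
                Callable<R> task = () -> extractOne.apply(input);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    // A failed task is logged and skipped
                    System.err.println("Task failed: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    break;
                }
            }
            return results;
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }
}
```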

## Performance Tips

1. **Use try-with-resources**: Ensures proper cleanup
2. **Parallel Streams**: Process multiple URLs concurrently
3. **Selective Extraction**: Extract only needed formats
4. **Connection Pooling**: Reuse HTTP client instances
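
For the connection pooling tip, one common approach with the JDK 11+ `java.net.http` client is to build a single `HttpClient` and share it across requests. The following is a minimal sketch; `SharedHttp` is a hypothetical holder class, and the 10-second timeout and redirect policy are illustrative values rather than recommendations from MetaOxide.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical holder (not part of MetaOxide): one HttpClient for the whole
// application, so its connection pool is reused instead of rebuilt per request.
final class SharedHttp {
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))      // illustrative value
            .followRedirects(HttpClient.Redirect.NORMAL) // illustrative policy
            .build();

    private SharedHttp() {}

    static HttpClient client() {
        return CLIENT;
    }
}
```

Code that fetches pages would then call `SharedHttp.client().send(...)` instead of creating a new client each time.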

## See Also

- [Getting Started Guide](/docs/getting-started/getting-started-java.md)
- [Spring Boot Integration](/docs/integrations/spring-boot-integration.md)
- [Examples](/examples/real-world/java-spring-service/)