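A Quick Guide to Apache Geode

1. Overview

Apache Geode is a distributed in-memory data grid that supports caching and data computation.

In this tutorial, we'll cover Geode's key concepts and run through some code samples using its Java client.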

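2. Setup

First, we need to download and install Apache Geode and set up the gfsh environment. To do that, we can follow the instructions in Geode's official guide.

Second, this tutorial will create some filesystem artifacts. We can keep them isolated by creating a temporary directory and launching everything from there.

2.1. Installation and Configuration

From our temporary directory, we need to start a Locator instance: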

gfsh> start locator --name=locator --bind-address=localhost

Locators are responsible for the coordination between different members of a Geode Cluster, which can be further administered over JMX.

Next, let's start a Server instance to host one or more data Regions:

gfsh> start server --name=server1 --server-port=0

We set the --server-port option to 0 so that Geode picks any available port. If we leave it out, the server uses the default port, 40404. A server is a configurable member of the Cluster that runs as a long-lived process and is responsible for managing data Regions.

And finally, we need a Region:

gfsh> create region --name=example --type=REPLICATE

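The Region is ultimately where we'll store our data.

2.2. Verification

Let's make sure that everything is working before we go any further.

First, let's check whether we have our Server and our Locator: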

gfsh> list members
 Name   | Id
------- | ----------------------------------------------------------
server1 | 192.168.0.105(server1:6119):1024
locator | 127.0.0.1(locator:5996:locator):1024 [Coordinator]

Next, let's check that we have our Region:

gfsh> describe region --name=example
..........................................................
Name            : example
Data Policy     : replicate
Hosting Members : server1

Non-Default Attributes Shared By Hosting Members

 Type  |    Name     | Value
------ | ----------- | ---------------
Region | data-policy | REPLICATE
       | size        | 0
       | scope       | distributed-ack

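We should also see some directories on the filesystem under our temporary directory, named "locator" and "server1".

With this output, we know that we're ready to move on.

3. Maven Dependency

Now that we have a running Geode, let's start looking at the client code.

To work with Geode in our Java code, we need to add the Apache Geode Java client library to our pom: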


<dependency>
    <groupId>org.apache.geode</groupId>
    <artifactId>geode-core</artifactId>
    <version>1.6.0</version>
</dependency>

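Let's begin by simply storing and retrieving some data in a couple of Regions.

4. Simple Storage and Retrieval

Let's demonstrate how to store single values, batches of values, and custom objects.

To start storing data in our "example" Region, let's connect to it using the locator: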

@Before
public void connect() {
    // 10334 is the default locator port
    this.cache = new ClientCacheFactory()
      .addPoolLocator("localhost", 10334)
        .create();
    this.region = this.cache
      .<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
        .create("example");
}

4.1. Saving Single Values

Now, we can simply store and retrieve data in our Region:

@Test
public void whenSendMessageToRegion_thenMessageSavedSuccessfully() {

    this.region.put("A", "Hello");
    this.region.put("B", "example");

    assertEquals("Hello", region.get("A"));
    assertEquals("example", region.get("B"));
}

4.2. Saving Multiple Values at Once

We can also save multiple values at once, say when trying to reduce network latency:

@Test
public void whenPutMultipleValuesAtOnce_thenValuesSavedSuccessfully() {

    Supplier<Stream<String>> keys = () -> Stream.of("A", "B", "C", "D", "E");
    Map<String, String> values = keys.get()
        .collect(Collectors.toMap(Function.identity(), String::toLowerCase));

    this.region.putAll(values);

    keys.get()
        .forEach(k -> assertEquals(k.toLowerCase(), this.region.get(k)));
}

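4.3. Saving Custom Objects

Strings are useful, but sooner rather than later we'll need to store custom objects.

Let's imagine we have a customer record that we want to store using the following key type: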

public class CustomerKey implements Serializable {
    private long id;
    private String country;

    // getters and setters
    // equals and hashcode
}
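The equals and hashCode methods elided above matter because entries are looked up by key equality, and a freshly constructed key must match the one used to store the value. The following is a minimal sketch, assuming both fields identify a customer. The wrapper class CustomerKeySketch and the two-argument constructor are illustrative and not part of the original sample; a plain HashMap stands in for the Region.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative wrapper; only the nested CustomerKey mirrors the class above.
class CustomerKeySketch {

    static class CustomerKey implements Serializable {
        private static final long serialVersionUID = 1L;

        private long id;
        private String country;

        // Hypothetical constructor; the article elides its constructors.
        CustomerKey(long id, String country) {
            this.id = id;
            this.country = country;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof CustomerKey)) {
                return false;
            }
            CustomerKey other = (CustomerKey) o;
            return id == other.id && Objects.equals(country, other.country);
        }

        // Must agree with equals: equal keys produce equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(id, country);
        }
    }

    // Stores a value under one key instance and reads it back with a
    // different but equal instance. A HashMap stands in for the Region.
    static String lookupWithEqualKey() {
        Map<CustomerKey, String> store = new HashMap<>();
        store.put(new CustomerKey(123L, "US"), "William");
        return store.get(new CustomerKey(123L, "US"));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey());
    }
}
```

Without the two overrides, the second key would be a different object with a different identity hash, and the lookup would return null.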

And the following value type:

public class Customer implements Serializable {
    private CustomerKey key;
    private String firstName;
    private String lastName;
    private Integer age;

    // getters and setters
}
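Both types implement Serializable. Before putting classes like these on a server, a quick local round trip through standard Java serialization can catch a non-serializable field early. This is a sketch using only the JDK and does not exercise Geode itself; the class name SerializationCheck, the trimmed-down Customer and its constructor are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationCheck {

    // Trimmed-down stand-in for the Customer value type above.
    static class Customer implements Serializable {
        private static final long serialVersionUID = 1L;

        final String firstName;
        final String lastName;
        final Integer age;

        Customer(String firstName, String lastName, Integer age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.age = age;
        }
    }

    // Writes the value to a byte array and reads it back, failing loudly
    // if the object graph cannot be serialized.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("value did not survive serialization", e);
        }
    }

    public static void main(String[] args) {
        Customer copy = roundTrip(new Customer("William", "Russell", 35));
        System.out.println(copy.firstName + " " + copy.lastName + " " + copy.age);
    }
}
```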

There are a couple of extra steps before we can store these.

First, they should implement Serializable. While this isn't a strict requirement, making them Serializable lets Geode store them more robustly.

Second, they need to be on our application's classpath as well as the classpath of our Geode Server.

To get them onto the server's classpath, let's package them up, say using mvn clean package.

Then we can reference the resulting jar in a new start server command:

gfsh> stop server --name=server1
gfsh> start server --name=server1 --classpath=../lib/apache-geode-1.0-SNAPSHOT.jar --server-port=0

Again, we have to run these commands from the temporary directory.

Finally, let's create a new Region named "example-customers" on the Server, using the same command we used to create the "example" Region:

gfsh> create region --name=example-customers --type=REPLICATE

In the code, we reach out to the locator as before, this time specifying the custom types:

@Before
public void connect() {
    // ... connect through the locator
    this.customerRegion = this.cache
      .<CustomerKey, Customer>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
        .create("example-customers");
}

And then, we can store our customer as before:

@Test
public void whenPutCustomKey_thenValuesSavedSuccessfully() {
    CustomerKey key = new CustomerKey(123);
    Customer customer = new Customer(key, "William", "Russell", 35);

    this.customerRegion.put(key, customer);

    Customer storedCustomer = this.customerRegion.get(key);
    assertEquals("William", storedCustomer.getFirstName());
    assertEquals("Russell", storedCustomer.getLastName());
}

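5. Region Types

In most environments, we'll have more than one copy or more than one partition of our Region, depending on our read and write throughput requirements.

So far, we've used in-memory replicated Regions. Let's take a closer look.

5.1. Replicated Region

As the name suggests, a Replicated Region maintains copies of its data on more than one Server. Let's test this.

From the gfsh console in our working directory, let's add one more Server, named server2, to the cluster: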

gfsh> start server --name=server2 --classpath=../lib/apache-geode-1.0-SNAPSHOT.jar --server-port=0

Remember that when we created "example", we used --type=REPLICATE. Because of this, Geode will automatically replicate our data to the new server.

Let's verify this by stopping server1:

gfsh> stop server --name=server1

Then, let's run a quick query on the "example" Region.

If the data was replicated successfully, we'll get results back:

gfsh> query --query='select e.key from /example.entries e'
Result : true
Limit  : 100
Rows   : 5

Result
------
C
B
A
E
D

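So, it looks like the replication succeeded!

Adding a replica to our Region improves data availability. And because more than one server can respond to queries, we get higher read throughput as well.

But what if they both crash? Since these are in-memory Regions, the data will be lost. For this case, we can instead use --type=REPLICATE_PERSISTENT, which also stores the data on disk while replicating.

5.2. Partitioned Region

With larger datasets, we can scale the system better by configuring Geode to split a Region into separate partitions, or buckets.

Let's create one partitioned Region named "example-partitioned":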

gfsh> create region --name=example-partitioned --type=PARTITION

Let's add some data:

gfsh> put --region=example-partitioned --key="1" --value="one"
gfsh> put --region=example-partitioned --key="2" --value="two"
gfsh> put --region=example-partitioned --key="3" --value="three"

And quickly verify it:

gfsh> query --query='select e.key, e.value from /example-partitioned.entries e'
Result : true
Limit  : 100
Rows   : 3

key | value
--- | -----
2   | two
1   | one
3   | three

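Then, to validate that the data got partitioned, let's stop server1 again (this assumes it was restarted after Section 5.1, before we created the Region) and re-run the query: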

gfsh> stop server --name=server1
gfsh> query --query='select e.key, e.value from /example-partitioned.entries e'
Result : true
Limit  : 100
Rows   : 1

key | value
--- | -----
2   | two

We got only some of the entries back this time, because each server holds only some of the partitions. When server1 went down, the data in its partitions was lost.
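Why a stopped member takes entries with it can be illustrated with a toy model: each key hashes to a bucket, and each bucket lives on exactly one member. The sketch below is not Geode's actual placement algorithm. The bucket count, the round-robin assignment and the class name PartitionSketch are assumptions chosen for illustration, so the surviving keys need not match the gfsh output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartitionSketch {

    // Maps a key to a bucket by hash; floorMod keeps the result non-negative.
    static int bucketFor(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // Assigns buckets to members round-robin (an assumed, simplified policy).
    static String ownerOf(int bucket, List<String> members) {
        return members.get(bucket % members.size());
    }

    // Returns the keys that are still reachable after one member stops,
    // given that no bucket has a second copy anywhere.
    static List<String> survivingKeys(List<String> keys, List<String> members,
                                      String stoppedMember, int bucketCount) {
        List<String> surviving = new ArrayList<>();
        for (String key : keys) {
            String owner = ownerOf(bucketFor(key, bucketCount), members);
            if (!owner.equals(stoppedMember)) {
                surviving.add(key);
            }
        }
        return surviving;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("1", "2", "3");
        List<String> members = Arrays.asList("server1", "server2");
        System.out.println(survivingKeys(keys, members, "server1", 4));
    }
}
```

In this model, every key whose bucket belonged to the stopped member disappears from the result, which is the same effect the second query shows.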

But what if we need both partitioning and redundancy? Geode also supports a number of other types. The following three are handy:

PARTITION_REDUNDANT partitions and replicates our data across different members of the cluster
PARTITION_PERSISTENT partitions the data like PARTITION, but also stores it on disk, and
PARTITION_REDUNDANT_PERSISTENT gives us all three behaviors.
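The effect of redundancy on a partitioned Region can be sketched with the same kind of toy model: if every bucket has a primary holder plus one extra copy on a different member, stopping a single member no longer loses that bucket. This is an illustration only. How Geode actually chooses the members that host copies is not modeled here, and the class name RedundancySketch and the neighbor-based placement are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RedundancySketch {

    // Returns the members holding a bucket: one primary plus up to
    // `redundantCopies` further members, chosen as the next neighbors.
    static List<String> holdersOf(int bucket, List<String> members, int redundantCopies) {
        List<String> holders = new ArrayList<>();
        for (int i = 0; i <= redundantCopies && i < members.size(); i++) {
            holders.add(members.get((bucket + i) % members.size()));
        }
        return holders;
    }

    // A bucket survives if at least one of its holders is still running.
    static boolean survives(int bucket, List<String> members,
                            int redundantCopies, String stoppedMember) {
        for (String holder : holdersOf(bucket, members, redundantCopies)) {
            if (!holder.equals(stoppedMember)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("server1", "server2");
        // Bucket 2 has server1 as its primary holder in this model.
        System.out.println("no redundancy: " + survives(2, members, 0, "server1"));
        System.out.println("one extra copy: " + survives(2, members, 1, "server1"));
    }
}
```

Redundancy protects against losing a member while others keep running; persistence addresses the separate case where every holder goes down.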

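6. Object Query Language

Geode also supports Object Query Language, or OQL, which can be more powerful than a simple key lookup. It's a bit like SQL.

For this example, let's use the "example-customers" Region we built earlier.

If we add a couple more customers: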

Map<CustomerKey, Customer> data = new HashMap<>();
data.put(new CustomerKey(1), new Customer("Gheorge", "Manuc", 36));
data.put(new CustomerKey(2), new Customer("Allan", "McDowell", 43));
this.customerRegion.putAll(data);

Then we can use QueryService to find customers whose first name is "Allan":

QueryService queryService = this.cache.getQueryService();
String query =
  "select * from /example-customers c where c.firstName = 'Allan'";
SelectResults<Customer> results =
  (SelectResults<Customer>) queryService.newQuery(query).execute();
assertEquals(1, results.size());

7. Function

One of the more powerful notions of in-memory data grids is the idea of "taking the computations to the data".

Simply put, since Geode is pure Java, it's easy for us to not only send data but also logic to perform on that data.

This might remind us of SQL extensions like PL/SQL or Transact-SQL.
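The idea can be sketched without any grid at all: instead of pulling every entry to the caller, the caller hands a small piece of logic to each member, which applies it to its local data and reports back only a summary. The following toy model uses plain JDK types. The Member class, the ComputeToDataSketch name and the "number of changed entries" result are assumptions for illustration and do not reflect Geode's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

class ComputeToDataSketch {

    // Stand-in for one cluster member holding part of the data in memory.
    static class Member {
        final Map<Integer, String> localData = new HashMap<>();

        // Applies the logic to every local value in place and returns
        // only the number of entries that actually changed.
        int apply(UnaryOperator<String> logic) {
            int changed = 0;
            for (Map.Entry<Integer, String> entry : localData.entrySet()) {
                String updated = logic.apply(entry.getValue());
                if (!updated.equals(entry.getValue())) {
                    entry.setValue(updated);
                    changed++;
                }
            }
            return changed;
        }
    }

    // Sends the same logic to every member and sums the small results.
    static int upperCaseEverywhere(List<Member> members) {
        int total = 0;
        for (Member member : members) {
            total += member.apply(String::toUpperCase);
        }
        return total;
    }

    public static void main(String[] args) {
        Member first = new Member();
        first.localData.put(1, "Gheorge");
        Member second = new Member();
        second.localData.put(2, "Allan");

        int changed = upperCaseEverywhere(Arrays.asList(first, second));
        System.out.println(changed + " entries changed: "
            + first.localData.get(1) + ", " + second.localData.get(2));
    }
}
```

Only the logic and a count cross the boundary between caller and member here, while the values themselves never leave the member that holds them.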

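7.1. Defining a Function

To define a unit of work for Geode to do, we implement Geode's Function interface.

For example, let's imagine we need to change all the customers' names to upper case.

Instead of querying the data and having our application do the work, we can just implement Function: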

public class UpperCaseNames implements Function<Boolean> {
    @Override
    public void execute(FunctionContext<Boolean> context) {
        RegionFunctionContext regionContext = (RegionFunctionContext) context;
        Region<CustomerKey, Customer> region = regionContext.getDataSet();

        for ( Map.Entry<CustomerKey, Customer> entry : region.entrySet() ) {
            Customer customer = entry.getValue();
            customer.setFirstName(customer.getFirstName().toUpperCase());
        }
        context.getResultSender().lastResult(true);
    }

    @Override
    public String getId() {
        return getClass().getName();
    }
}

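Note that getId must return a unique value, so the class name is typically a good pick.

The FunctionContext contains all our Region data, so we can run a more sophisticated query against it or, as we've done here, mutate it.

Function has plenty more power than this, so check out the official manual, especially the getResultSender method.

7.2. Deploying the Function

We need to make Geode aware of our function before it can run it. As we did with our custom data types, we'll package the jar.

But this time, we can just use the deploy command: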

gfsh> deploy --jar=./lib/apache-geode-1.0-SNAPSHOT.jar

7.3. Executing the Function

Now, we can execute the Function from the application using the FunctionService:

@Test
public void whenExecuteUppercaseNames_thenCustomerNamesAreUppercased() {
    Execution execution = FunctionService.onRegion(this.customerRegion);
    execution.execute(UpperCaseNames.class.getName());
    Customer customer = this.customerRegion.get(new CustomerKey(1));
    assertEquals("GHEORGE", customer.getFirstName());
}

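8. Conclusion

In this article, we learned the basic concepts of the Apache Geode ecosystem. We looked at simple gets and puts with standard and custom types, replicated and partitioned Regions, and OQL and function support.

And as always, all of these samples are available over on GitHub.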
