
How to work with AWS Cloudsearch PHP SDK


Vikrant Sundriyal | 1 March 2016


Blog also covers
AWS Cloudsearch basic operations with PHP SDK
AWS Cloudsearch Drawbacks
AWS Cloudsearch Query
AWS Cloudsearch Suggester

 

Cloudsearch is an AWS service that helps in searching large collections of data such as documents, web pages and posts.
While integrating Cloudsearch into one of my projects, I found very little information on how to use the Cloudsearch PHP SDK. Most of my questions about the SDK were not answered convincingly, even by AWS engineers. Almost all the examples in the AWS docs and elsewhere use URL-based search, which may not suit every scenario or project.
Through this blog I want to give simple snippets that anyone struggling with the PHP SDK can refer to. I will also list a few issues a developer may face with AWS Cloudsearch.

The primary tasks with any search engine are uploading documents, searching, and getting suggestions (auto-completion).
Below are simple snippets for each of these with Cloudsearch:
Upload document:
<?php
require 'vendor/autoload.php'; // Composer autoloader for the AWS SDK

use Aws\CloudSearchDomain\CloudSearchDomainClient;

$CSclient = CloudSearchDomainClient::factory(array(
    'credentials' => array(
        'key'    => 'YOUR KEY',
        'secret' => 'YOUR SECRET KEY',
    ),
    'endpoint' => 'YOUR END POINT',
));

// A batch is an array of operations; append more documents as needed.
$parameter = array(
    array(
        'type'   => 'add',
        'id'     => 'a0687322-5d77-2411-cfc2232d-54e618378373', // unique document ID, required for every operation in a batch
        'fields' => array( // use the fields configured for your domain
            'field1'         => 'ABC',
            'field2'         => 'XYZ',
            'field3_integer' => 0,                      // integer
            'field4_date'    => '2015-06-12T00:00:00Z', // date in UTC, yyyy-mm-ddTHH:mm:ssZ
        ),
    ),
);
$json = json_encode($parameter);
print_r($json);

$response = $CSclient->uploadDocuments(array(
    'documents'   => $json,
    'contentType' => 'application/json',
));

if ($response->get('status') === 'success') {
    echo $response->get('adds') . "\r\n"; // number of documents added
} else {
    echo "Upload did not succeed\r\n";
    print_r($response);
}
?>
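If you upload from several places, it can help to build the batch JSON in one spot. The following is only a sketch: `buildDocumentBatch()` and `toCloudSearchDate()` are hypothetical helper names, not part of the SDK, and the IDs and field names are placeholders. It assumes the batch format used above (a JSON array of `add` operations) plus `delete` operations, which carry only a type and an ID.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): builds the JSON string
// that is passed as 'documents' to uploadDocuments().
// $adds maps document ID => array of field values; $deleteIds lists IDs to remove.
function buildDocumentBatch(array $adds, array $deleteIds = array())
{
    $batch = array();
    foreach ($adds as $id => $fields) {
        $batch[] = array('type' => 'add', 'id' => (string) $id, 'fields' => $fields);
    }
    foreach ($deleteIds as $id) {
        $batch[] = array('type' => 'delete', 'id' => (string) $id);
    }
    return json_encode($batch);
}

// Hypothetical helper: formats a Unix timestamp as a UTC date string
// in the yyyy-mm-ddTHH:mm:ssZ form used for date fields.
function toCloudSearchDate($timestamp)
{
    return gmdate('Y-m-d\TH:i:s\Z', $timestamp);
}

// Placeholder IDs and fields, for illustration only.
$json = buildDocumentBatch(
    array('doc-1' => array('field1' => 'ABC', 'field4_date' => toCloudSearchDate(1434067200))),
    array('doc-2')
);
echo $json, "\n";
```

The resulting string can be passed as the 'documents' value in the upload call shown above.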
Search document (with filter and range conditions, using a property search domain as the example):
<?php
// $CSclient is created exactly as in the upload snippet above.
$result = $CSclient->search(array(
    'query' => 'YOUR TEXT TO BE QUERIED',
    // fields to be returned for each hit
    'return' => 'property_name,property_city,property_locationname,property_area,property_bedroom,property_bathroom,price,property_for,property_image,property_address',
    // fields to be considered for the search
    'queryOptions' => json_encode(array('fields' => array(
        'property_address', 'property_name', 'property_type', 'property_for',
        'description', 'property_city', 'property_landmark', 'property_locationname',
        'property_state', 'property_bedroom',
    ))),
    // An array key can appear only once (a second 'filterQuery' would silently
    // replace the first), so both conditions go into a single expression:
    // deleted = 0 AND price between 30000000 and 50000000 (numeric field assumed).
    'filterQuery' => '(and deleted:0 price:[30000000,50000000])',
));
$hitCount = $result->getPath('hits/found');
echo "Number of Hits: {$hitCount}\n";
print_r($result);
?>
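The search call takes a single filterQuery string, so several conditions have to be merged into one structured expression before the call. A minimal sketch of that step follows; `buildFilterQuery()` and `numericRange()` are hypothetical helpers, and the sketch assumes `price` is a numeric field, so the range bounds are written without quotes.

```php
<?php
// Hypothetical helper: joins structured-syntax conditions into one
// filterQuery string, wrapping them in (and ...) when there are several.
function buildFilterQuery(array $conditions)
{
    // Drop empty strings so optional filters can be passed in as ''.
    $conditions = array_values(array_filter($conditions, function ($c) {
        return $c !== '';
    }));
    if (count($conditions) === 0) {
        return null; // nothing to filter on
    }
    if (count($conditions) === 1) {
        return $conditions[0];
    }
    return '(and ' . implode(' ', $conditions) . ')';
}

// Hypothetical helper: inclusive numeric range in the field:[min,max] form.
function numericRange($field, $min, $max)
{
    return sprintf('%s:[%d,%d]', $field, $min, $max);
}

$filter = buildFilterQuery(array(
    'deleted:0',
    numericRange('price', 30000000, 50000000),
));
echo $filter, "\n";
```

The value of `$filter` would then be passed once as 'filterQuery' in the search call.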
Suggest document:
<?php
// $CSclient is created exactly as in the upload snippet above.
$result = $CSclient->suggest(array(
    'query'     => 'YOUR TEXT TO BE AUTO COMPLETED',
    'suggester' => 'YOUR SUGGESTER NAME',
    'size'      => 150, // maximum number of suggestions to return
));
print_r($result);
?>
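For an auto-complete box you usually need only the suggested strings, not the whole result. The sketch below assumes the result, converted to a plain array (for example with `$result->toArray()`), has the shape suggest → suggestions → list of entries with a 'suggestion' key. `extractSuggestions()` is a hypothetical helper and the sample data is made up for illustration.

```php
<?php
// Hypothetical helper: pulls the suggestion strings out of a suggest
// response that has already been converted to a plain array.
function extractSuggestions(array $response)
{
    $items = isset($response['suggest']['suggestions'])
        ? $response['suggest']['suggestions']
        : array();

    $suggestions = array();
    foreach ($items as $item) {
        if (isset($item['suggestion'])) {
            $suggestions[] = $item['suggestion'];
        }
    }
    return $suggestions;
}

// Made-up sample in the assumed response shape, for illustration only.
$sample = array(
    'suggest' => array(
        'query'       => 'oce',
        'found'       => 2,
        'suggestions' => array(
            array('suggestion' => 'Ocean View Apartments', 'score' => 0, 'id' => 'doc-1'),
            array('suggestion' => 'Ocean Park Villa', 'score' => 0, 'id' => 'doc-2'),
        ),
    ),
);

print_r(extractSuggestions($sample));
```

Returning an empty array when the keys are missing keeps the calling code free of extra checks.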
===============

With all the advantages and ease Cloudsearch provides to developers, I found certain issues worth documenting, which you may hit if it is used extensively.

1) 'Small' is the smallest instance type available for a Cloudsearch domain, which increases the monthly domain cost.
2) Once a domain is launched, there is no option to stop or pause it to reduce billing.
3) Every insert is charged, which may result in high costs if you have two or three test/prod environments.
4) Cloudsearch tokenises text on spaces and commas, which may give unexpected results when you search for text containing spaces or special characters.
5) There is no easy export feature to take data out of Cloudsearch.
6) The fields of a Cloudsearch domain are preconfigured (it has a schema), unlike Elasticsearch. A fixed schema may not be suitable if the set of fields is not fixed for a given domain.