bunkat · April 23, 2014 07:35
diff --git a/gistfile1.txt b/gistfile1.txt
 I took a look at the document and had the following comments.  Keep in mind I have no experience with the internals of Neo4j and only moderate experience with using Neo4j (exclusively the REST endpoint using Cypher as the query language).  I thought I would share my thoughts just in case any of it might prove useful.

 Gzip

 I didn’t see any mention of support for Gzip encoding the results.  Since the result sets for Neo4j queries can be large, there would be significant benefit to supporting compression.  I expected to see the following specified on each of the requests:

 Accept-Encoding:  gzip, compress
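
 As a rough illustration of the client side, here is a minimal Python sketch.  The helper name decode_body and the sample payload are made up for the example.  It only assumes the server reports the encoding it chose in a Content-Encoding header, as HTTP servers normally do.

```python
import gzip
import json

# Header a client would send to advertise that it accepts compressed responses.
REQUEST_HEADERS = {"Accept-Encoding": "gzip"}


def decode_body(headers, body):
    """Decode a response body (bytes) into Python objects.

    Hypothetical helper: decompresses only when the server says it used gzip.
    """
    if headers.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


# Simulate a server that compressed a repetitive result set.
payload = json.dumps([{"name": "mine", "weight": 10}] * 1000).encode("utf-8")
compressed = gzip.compress(payload)

rows = decode_body({"Content-Encoding": "gzip"}, compressed)
print(len(payload), len(compressed), len(rows))
```

 Result sets tend to repeat the same keys over and over, which is exactly the kind of data that compresses well.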


 Requests

 I was a bit confused by the request that was shown, particularly the RETURN statements.  It looks like a literal map was requested, but not using the literal map format.  I’m not sure what the following syntax means:

 RETURN { bike, fork, front, bike.name }
  
 I was expecting something like this:

 RETURN bike, fork, front, bike.name

 Which means “return the values of bike, fork, front, and bike.name”.  Or something like this:

 RETURN {bike: bike, fork: fork, front: front, name: bike.name}

 Which means “return a literal map with properties bike, fork, front, and name with the specified values”.  If the original format was a new return syntax, I haven’t taken that into account in any of my response examples below.

 Responses

 When defining the response format, personally I would take a step back and try not to define a response format.  There is no single response format that will meet the needs of every developer, thus trying to create the perfect response format doesn’t seem solvable.  For example, none of the formats in the specification help my use cases at all – they would all be expensive to decode.  They also all seem to try and keep the column information separate from the data and I’m not sure why.  That is definitely not a common way to express JSON objects and I’m not sure what the benefit would be (but maybe I’m missing something). 

 Luckily, Cypher 2.0 is so well designed and so flexible that there is no need for a response format.  Instead, you could just provide the tools for developers to define exactly how they want the data returned.  It turns out that this approach can be pretty flexible.

 I used the following principles to think about how this might work:
 •	The responses must be valid JSON (please don’t use YAML!)
 •	There must be some way to produce a response that can be directly decoded into native objects whatever shape those native objects take
 •	Responses should only include what is specified by the Cypher RETURN statement and nothing else 
 •	No additional keywords should be added to Cypher

 Given these principles, I came up with a few simple rules for how you could define the response format using Cypher:
 •	By default results are passed back as a simple array of values
 •	Using the AS keyword groups the results into a map (JSON object) under the key name provided
 •	When all results are grouped using AS, there is no need to contain the single map within an array
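
 One way to read these rules is as a small shaping function applied to the rows a query produces.  The Python sketch below is only an interpretation of the rules.  The name shape_results and the (name, aliased) column description are invented for illustration and are not part of Neo4j.

```python
def shape_results(columns, rows):
    """Shape query rows according to the three rules above.

    columns: list of (name, aliased) pairs; aliased is True when the
             RETURN item used AS.
    rows:    list of rows, each a list of values in column order.
    """
    all_aliased = bool(columns) and all(aliased for _, aliased in columns)

    # Rule 3: everything was named with AS, so return one map of
    # alias -> list of values instead of an array of rows.
    if all_aliased:
        return {name: [row[i] for row in rows]
                for i, (name, _) in enumerate(columns)}

    shaped = []
    for row in rows:
        items = []
        for (name, aliased), value in zip(columns, row):
            # Rule 2: an aliased item is grouped under its key.
            # Rule 1: anything else is passed back as a plain value.
            items.append({name: [value]} if aliased else value)
        # Assumption: a single unaliased item is returned as-is rather
        # than wrapped in a one-element array.
        shaped.append(items[0] if len(items) == 1 else items)
    return shaped


rows = [[{"weight": 10}, "mine"]]
print(shape_results([("bike", False), ("bike.name", False)], rows))
print(shape_results([("model", True), ("name", True)], rows))
```

 The first call leaves the row as a plain array of values, and the second collapses everything into a single map keyed by the AS names.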

 Here are a bunch of examples that show the different permutations:

 Basic return of values

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN bike, fork, bike.name"
  } ]
 }

 Since AS was not specified, the data values are returned using a simple array of values.  Note that only the requested data is returned, no extra data is included as part of some ‘default’ response format.  The user requested the values of the bike, fork, and bike name and that is exactly what they got.

 200: OK
 [ /* array of statement results */
  [ /* array of query results */
    [ /* array of return values */
      {
        "id": "16",
        "labels": [
          "Bike"
        ],
        "properties": {
          "weight": 10
        }
      },
      {
        "id": "9",
        "type": "HAS",
        "startNode": "16",
        "endNode": "17",
        "properties": {
          "position": 1
        }
      },
      "mine"
    ]
  ]
 ]

 Return values using the AS keyword

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN bike AS model, fork AS part, bike.name AS name"
  } ]
 }

 Using the AS keyword acts as a grouping function and groups results in a literal map under the specified key name. In this case, all of the values are being grouped and so there is no need to contain the resulting map inside an array. This example makes it a little hard to see, but if there were multiple bike nodes, they would all be returned under the “model” key as elements of its array value.

 200: OK
 [ /* array of statement results */
  { /* map of query results */
    "model": [{
      "id": "16",
      "labels": [
        "Bike"
      ],
      "properties": {
        "weight": 10
      }
    }],
    "part": [{
      "id": "9",
      "type": "HAS",
      "startNode": "16",
      "endNode": "17",
      "properties": {
        "position": 1
      }
    }],
    "name": ["mine"]
  }
 ]
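
 To make the multiple-node case concrete, here is a small Python sketch of how a client might walk a grouped result that contains two bikes.  The second bike and the helper name rows_from_groups are hypothetical.  The sketch assumes every key holds one entry per result row, in the same order.

```python
def rows_from_groups(grouped):
    """Turn {"key": [v1, v2, ...], ...} back into one dict per result row.

    Assumes every key holds a list of the same length, one entry per row.
    """
    keys = list(grouped)
    columns = [grouped[key] for key in keys]
    return [dict(zip(keys, values)) for values in zip(*columns)]


# Hypothetical grouped result for a query that matched two bikes.
grouped = {
    "model": [{"id": "16"}, {"id": "21"}],
    "name": ["mine", "yours"],
}

for row in rows_from_groups(grouped):
    print(row["name"], row["model"]["id"])
```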

 Return literal map

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN {bike: bike, part: fork, name: bike.name}"
  } ]
 }

 Without specifying AS, literal maps are returned like the simple array of values.  This in particular is the holy grail of response formats that I need for my scenarios.  Everything I do returns an array of literal maps that are immediately decoded into native objects.

 200: OK
 [ /* array of statement results */
  [ /* array of query results */
    { /* literal map result */
      "bike": {
        "id": "16",
        "labels": [
          "Bike"
        ],
        "properties": {
          "weight": 10
        }
      },
      "part": {
        "id": "9",
        "type": "HAS",
        "startNode": "16",
        "endNode": "17",
        "properties": {
          "position": 1
        }
      },
      "name": "mine"
    }
  ]
 ]
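
 This is roughly what “immediately decoded into native objects” looks like in practice.  The Python sketch below is illustrative only: the BikeRecord class is made up, and the body is the response above with the comments removed so that it parses as JSON.

```python
import json
from dataclasses import dataclass


@dataclass
class BikeRecord:
    # Field names mirror the keys chosen in the RETURN literal map.
    bike: dict
    part: dict
    name: str


# Response body in the shape shown above (comments removed so it parses).
body = """
[
  [
    {
      "bike": {"id": "16", "labels": ["Bike"], "properties": {"weight": 10}},
      "part": {"id": "9", "type": "HAS", "startNode": "16", "endNode": "17",
               "properties": {"position": 1}},
      "name": "mine"
    }
  ]
]
"""

statement_results = json.loads(body)
# First statement's results: each literal map becomes one native object.
records = [BikeRecord(**item) for item in statement_results[0]]
print(records[0].name, records[0].part["type"])
```

 Because the keys in the map are chosen by the query, no intermediate mapping step is needed between the response and the object.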

 Return literal maps using the AS keyword

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN {bike: bike, part: fork, name: bike.name} AS results"
  } ]
 }

 The map results are grouped under the specified label similar to how values were in previous examples.

 200: OK
 [ /* array of statement results */
  { /* map of grouped results */
    "results": [ /* array of return results grouped by AS */
      {
        "bike": {
          "id": "16",
          "labels": [
            "Bike"
          ],
          "properties": {
            "weight": 10
          }
        },
        "part": {
          "id": "9",
          "type": "HAS",
          "startNode": "16",
          "endNode": "17",
          "properties": {
            "position": 1
          }
        },
        "name": "mine"
      }
    ]
  }
 ]

 Returning an array of map literals

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN {bike: bike, part: fork}, {name: bike.name}"
  } ]
 }

 If you need to return multiple literal maps, you can return them in a similar manner to returning multiple values. They will simply be returned as a nested array within the query results.

 200: OK
 [ /* array of statement results */
  [ /* array of query results */
    [ /* array of return values */
      {
        "bike": {
          "id": "16",
          "labels": [
            "Bike"
          ],
          "properties": {
            "weight": 10
          }
        },
        "part": {
          "id": "9",
          "type": "HAS",
          "startNode": "16",
          "endNode": "17",
          "properties": {
            "position": 1
          }
        }
      },
      {
        "name": "mine"
      }
    ]
  ]
 ]

 Returning a map of map literals

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN {bike: bike, part: fork} AS things, {name: bike.name} AS owners"
  } ]
 }

 If you need to return multiple literal maps, you can also group them using the AS keyword.

 200: OK
 [ /* array of statement results */
  { /* map of grouped results */
    "things": [ /* array of thing results */
      {
        "bike": {
          "id": "16",
          "labels": [
            "Bike"
          ],
          "properties": {
            "weight": 10
          }
        },
        "part": {
          "id": "9",
          "type": "HAS",
          "startNode": "16",
          "endNode": "17",
          "properties": {
            "position": 1
          }
        }
      }
    ],
    "owners": [  /* array of owner results */
      { "name": "mine" }
    ]
  }
 ]


 Returning data using the column/data format

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN [\"bike\", \"fork\", \"bike.name\"] AS columns,
               [bike, fork, bike.name] AS data"
  } ]
 }

 If you need the column data format that is currently used by Neo4j responses, you can return it yourself using array literals and the AS keyword. 

 200: OK
 [ /* array of statement results */
  { /* map of return results */
    "columns": ["bike", "fork", "bike.name"],
    "data": [
      [
        {
          "id": "16",
          "labels": [
            "Bike"
          ],
          "properties": {
            "weight": 10
          }
        },
        {
          "id": "9",
          "type": "HAS",
          "startNode": "16",
          "endNode": "17",
          "properties": {
            "position": 1
          }
        },
        "mine"
      ]
    ]
  }
 ]
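
 For comparison, this is the extra step a client has to perform before column/data output looks like ordinary objects.  It is a minimal Python sketch: the helper name rows_as_dicts is made up, and the result is a trimmed-down version of the response above.

```python
def rows_as_dicts(result):
    """Pair each data row with the column names to get one dict per row."""
    columns = result["columns"]
    return [dict(zip(columns, row)) for row in result["data"]]


# Trimmed-down result in the column/data shape.
result = {
    "columns": ["bike", "fork", "bike.name"],
    "data": [[{"id": "16"}, {"id": "9", "type": "HAS"}, "mine"]],
}

for row in rows_as_dicts(result):
    print(row["bike.name"], row["fork"]["type"])
```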


 Returning a mix of grouped and ungrouped values

 {
  "statements" : [ {
    "statement" : "CREATE (bike:Bike { weight: 10, name:\"mine\" }) 
        CREATE ( frontWheel:Wheel { spokes: 3 } )
        CREATE (backWheel:Wheel { spokes: 32 } )
        CREATE front=(bike)-[fork:HAS {position: 1}]->(frontWheel) 
        RETURN {bike: bike, part: fork} AS thing, bike.name"
  } ]
 }

 This is an odd case and could just be disallowed.  If it is allowed, it reads as "for each result, return thing and bike.name as elements, with thing being a group of literal map results", but that group would always contain just a single element (the result).  The response would end up looking like the following, which probably wouldn’t be all that useful.

 200: OK
 [
  [
    [
      {
        "thing": [{
          "bike": {
            "id": "16",
            "labels": [
              "Bike"
            ],
            "properties": {
              "weight": 10
            }
          },
          "part": {
            "id": "9",
            "type": "HAS",
            "startNode": "16",
            "endNode": "17",
            "properties": {
              "position": 1
            }
          }
        }]
      },
      "mine"
    ]
  ]
 ]

 JSON representation of nodes, relationships, and paths

 The other thing that needs to be defined is a suitable JSON representation when returning nodes, relationships, and paths.  Honestly though, in my experience that’s pretty secondary.   I’ve built large systems with Neo4j and have no idea what the JSON representation of these things currently looks like, since the Cypher Ref Card tells you to avoid returning them anyways.