Package adql.parser
Class QueryFixer
java.lang.Object
adql.parser.QueryFixer
Tool able to fix some common errors in ADQL queries.
See fix(String) for more details.
- Since:
- 2.0
Field Summary
Fields (Modifier and Type, Field, Description):
- protected final ADQLGrammar grammarParser: The internal ADQL grammar parser in use.
- mapRegexUnicodeConfusable: All of the most common Unicode confusable characters and their ASCII/UTF-8 alternative.
- protected final String REGEX_DASH: Regular expression matching all Unicode alternatives for the dash (-).
- protected final String REGEX_DOUBLE_QUOTE: Regular expression matching all Unicode alternatives for the double quote (").
- protected final String REGEX_EQUAL: Regular expression matching all Unicode alternatives for the equals sign (=).
- protected final String REGEX_GREATER_THAN: Regular expression matching all Unicode alternatives for the greater-than sign (>).
- protected final String REGEX_LESS_THAN: Regular expression matching all Unicode alternatives for the less-than sign (<).
- protected final String REGEX_PLUS: Regular expression matching all Unicode alternatives for the plus sign (+).
- protected final String REGEX_QUOTE: Regular expression matching all Unicode alternatives for the single quote (').
- protected final String REGEX_SPACE: Regular expression matching all Unicode alternatives for the space character.
- protected final String REGEX_STOP: Regular expression matching all Unicode alternatives for the full stop (.).
- protected final String REGEX_UNDERSCORE: Regular expression matching all Unicode alternatives for the underscore (_).
-
Constructor Summary
Constructors -
Method Summary
Methods (Modifier and Type, Method, Description):
- fix(String adqlQuery): Try fixing tokens/terms of the given ADQL query.
- protected boolean mustEscape(Token token, Token nextToken): Tell whether the given token must be double quoted.
- protected String replaceUnicodeConfusables(String adqlQuery): Replace all Unicode characters that can be confused with other ASCII/UTF-8 characters (e.g. different spaces, dashes, ...) with their ASCII version.
-
Field Details
-
grammarParser
The internal ADQL grammar parser in use. -
mapRegexUnicodeConfusable
All of the most common Unicode confusable characters and their ASCII/UTF-8 alternative.
The keys of this map are the ASCII characters, while the values are the regular expressions matching all their possible Unicode alternatives.
Note: All of them have been listed using Unicode Utilities: Confusables.
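The key/value layout described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class name ConfusableMapSketch, the method replaceConfusables and the few Unicode code points listed are assumptions chosen for illustration, and the real map covers far more characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;

final class ConfusableMapSketch {
    // Hypothetical, deliberately tiny map: key = ASCII character,
    // value = regular expression matching a few Unicode look-alikes.
    static final Map<String, String> CONFUSABLES = new LinkedHashMap<>();
    static {
        // hyphen U+2010, en dash U+2013, em dash U+2014, minus sign U+2212
        CONFUSABLES.put("-", "[\u2010\u2013\u2014\u2212]");
        // no-break space U+00A0, en space U+2002, em space U+2003
        CONFUSABLES.put(" ", "[\u00A0\u2002\u2003]");
        // left and right single quotation marks U+2018, U+2019
        CONFUSABLES.put("'", "[\u2018\u2019]");
        // left and right double quotation marks U+201C, U+201D
        CONFUSABLES.put("\"", "[\u201C\u201D]");
    }

    // Replace every listed look-alike by the ASCII character it resembles.
    static String replaceConfusables(String query) {
        String result = query;
        for (Map.Entry<String, String> entry : CONFUSABLES.entrySet()) {
            result = result.replaceAll(entry.getValue(),
                                       Matcher.quoteReplacement(entry.getKey()));
        }
        return result;
    }

    public static void main(String[] args) {
        // The input uses a minus sign (U+2212) and curly single quotes.
        System.out.println(replaceConfusables(
            "SELECT ra \u2212 1 FROM t WHERE name = \u2018M31\u2019"));
    }
}
```

Keeping one regular expression per ASCII character means a single replaceAll call per key is enough, and new look-alikes can be added without touching the replacement loop.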
-
REGEX_DASH
Regular expression matching all Unicode alternatives for the dash (-).
-
REGEX_UNDERSCORE
Regular expression matching all Unicode alternatives for the underscore (_).
-
REGEX_QUOTE
Regular expression matching all Unicode alternatives for the single quote (').
-
REGEX_DOUBLE_QUOTE
Regular expression matching all Unicode alternatives for the double quote (").
-
REGEX_STOP
Regular expression matching all Unicode alternatives for the full stop (.).
-
REGEX_PLUS
Regular expression matching all Unicode alternatives for the plus sign (+).
-
REGEX_SPACE
Regular expression matching all Unicode alternatives for the space character.
-
REGEX_LESS_THAN
Regular expression matching all Unicode alternatives for the less-than sign (<).
-
REGEX_GREATER_THAN
Regular expression matching all Unicode alternatives for the greater-than sign (>).
-
REGEX_EQUAL
Regular expression matching all Unicode alternatives for the equals sign (=).
-
-
Constructor Details
-
QueryFixer
- Throws:
NullPointerException
-
-
Method Details
-
fix
Try fixing tokens/terms of the given ADQL query.
This function does not try to fix syntactic or semantic errors. It only tries to fix the most common issues in ADQL queries:
- Unicode characters confusable with ASCII characters (such as a space or a dash) are replaced by their ASCII alternative,
- any of the following are double quoted:
  - non-regular ADQL identifiers (e.g. _RAJ2000),
  - ADQL function names used as identifiers (e.g. distance),
  - SQL reserved keywords (e.g. public).
Note: This function does not use any instance variable of this parser (especially the InputStream or Reader provided at initialisation or ReInit).
- Parameters:
adqlQuery - The input ADQL query to fix.
- Returns:
The suggested correction of the given ADQL query.
- Throws:
ParseException - If any unrecognised character is encountered, or if anything else prevented the tokenization of some characters/words/terms.
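The double-quoting rules listed for fix can be sketched without the library, as a rough illustration only. Everything named here is an assumption: the class QuoteFixSketch, the one-entry keyword and function sets, and the regex-based scanner, which stands in for the real grammar tokenizer and ignores many cases (comments, numbers, escaped quotes).

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteFixSketch {
    // Assumed sample lists; the real lists are much longer.
    static final Set<String> SQL_RESERVED = Set.of("public");
    static final Set<String> ADQL_FUNCTIONS = Set.of("distance");

    // ADQL regular identifier: a letter, then letters, digits or underscores.
    static final Pattern REGULAR_ID = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    // Simplified scanner: delimited identifier, string literal, or word.
    static final Pattern TOKEN =
        Pattern.compile("\"[^\"]*\"|'[^']*'|[A-Za-z_][A-Za-z0-9_]*");

    // Decide whether a word must be double quoted; 'end' is the index just
    // after the word, used to look ahead for an opening parenthesis.
    static boolean mustQuote(String word, String query, int end) {
        String lower = word.toLowerCase(Locale.ROOT);
        if (SQL_RESERVED.contains(lower)) return true;        // e.g. public
        if (!REGULAR_ID.matcher(word).matches()) return true; // e.g. _RAJ2000
        if (ADQL_FUNCTIONS.contains(lower)) {                 // e.g. distance
            int i = end;
            while (i < query.length() && Character.isWhitespace(query.charAt(i))) i++;
            // A function name not followed by '(' is used as an identifier.
            return i >= query.length() || query.charAt(i) != '(';
        }
        return false;
    }

    static String fix(String query) {
        Matcher m = TOKEN.matcher(query);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String tok = m.group();
            boolean alreadyQuoted = tok.startsWith("\"") || tok.startsWith("'");
            String replacement = (!alreadyQuoted && mustQuote(tok, query, m.end()))
                ? "\"" + tok + "\"" : tok;
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fix("SELECT _RAJ2000, distance FROM public.t"));
        System.out.println(fix("SELECT DISTANCE(p1, p2) FROM t"));
    }
}
```

In this sketch the first query comes back with _RAJ2000, distance and public double quoted, while the second is returned unchanged because DISTANCE is followed by a parameter list.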
-
replaceUnicodeConfusables
Replace all Unicode characters that can be confused with other ASCII/UTF-8 characters (e.g. different spaces, dashes, ...) with their ASCII version.
- Parameters:
adqlQuery - The ADQL query string in which Unicode confusable characters must be replaced.
- Returns:
The same query without the most common Unicode confusable characters.
-
mustEscape
Tell whether the given token must be double quoted.
This function considers all of the following as terms to double quote:
- SQL reserved keywords,
- unrecognised regular identifiers (i.e. neither a delimited nor a valid ADQL regular identifier),
- ADQL function names without a parameter list.
- Parameters:
token - The token to analyze.
nextToken - The following token (useful to detect the start of a function's parameter list).
- Returns:
true if the given token must be double quoted, false to keep it as provided.
-