================
= ExcelMetrics =
================

The ExcelMetrics framework is used to extract information from one or more Microsoft Excel files (.xls or .xslx). A detailed configuration file allows defining how the Excel document should be read and what information should be extracted. This framework allows importing metrics, findings and textual information to existing artefacts or artefacts that will be created by the Data Provider. 


============
= form.xml =
============

You can customise form.xml to either:
- specify the path to a single Excel file to import
- specify a pattern to import all Excel files matching this pattern in a directory

In order to import a single Excel file:
=====================================
<?xml version="1.0" encoding="UTF-8"?>
<tags baseName="ExcelMetrics" needSources="false">
	<tag type="text" key="excel" defaultValue="/path/to/mydata.xslx" />
</tags>

Notes: 
- The excel key is mandatory.

In order to import all files matching a patter in a folder:
===========================================================
<?xml version="1.0" encoding="UTF-8"?>
<tags baseName="ExcelMetrics" needSources="false">
	<!-- Root directory containing Excel files to import-->
	<tag type="text" key="dir" defaultValue="/path/to/mydata" />
	<!-- Pattern that needs to be matched by a file name in order to import it-->
	<tag type="text" key="ext" defaultValue="*.xlsx" />
	<!-- search for files in sub-folders -->
	<tag type="booleanChoice" defaultValue="true" key="sub" />
</tags>


Notes: 
- The dir and ext keys are mandatory
- The sub key is optional (and its value set to false if not specified)


==============
= config.tcl =
==============

Sample config.tcl file:
=======================
# The separator to be used in the generated csv file
# Usually \t or ; or ,
set Separator  ";"

# The delimiter used in the input CSV file
# This is normally left empty, except when you know that some of the values in the CSV file
# contain the separator itself, for example:
# "A text containing ; the separator";no problem;end
# In this case, you need to set the delimiter to \" in order for the data provider to find 3 values instead of 4.
# To include the delimiter itself in a value, you need to escape it by duplicating it, for example:
# "A text containing "" the delimiter";no problemo;end
# Default: none
set Delimiter \"

# The path separator in an artefact's path
# in the generated CSV file.
set ArtefactPathSeparator "/"

# Ignore findings for artefacts that are not part of the project (orphan findings)
# When set to 1, the findings are ignored
# When set to 0, the findings are imported and attached to the APPLICATION node
# (default: 1)
set IgnoreIfArtefactNotFound 1

# For findings of a type that is not in your ruleset, set a default rule ID.
# The value for this parameter must be a valid rule ID from your analysys model.
# (default: empty)
set UnknownRuleId UNKNOWN_RULE

# Save the total count of orphan findings as a metric at application level
# Specify the ID of the metric to use in your analysys model
# to store the information
# (default: empty)
set OrphanArteCountId NB_ORPHANS

# Save the total count of unknown rules as a metric at application level
# Specify the ID of the metric to use in your analysys model
# to store the information
# (default: empty)
set OrphanRulesCountId NB_UNKNOWN_RULES

# Save the list of unknown rule IDs as textual information at application level
# Specify the ID of the metric to use in your analysys model
# to store the information
# (default: empty)
set OrphanRulesListId UNKNOWN_RULES_INFO

# The list of the Excel sheets to read, each sheet has the number of the first line to read
# A Perl regexp pattern can be used instead of the name of the sheet (the first sheet matching 
# the pattern will be considered)
set Sheets {{Baselines 5} {ChangeNotes 5}}

# ######################
# # COMMON DEFINITIONS #
# ######################
#
# - <value> is a list of column specifications whose values will be concatened. When no column name is present, the
#         text is taken as it appears. Optional sheet name can be added (with ! char to separate from the column name)
#		Examples:
#            - {C:} the value will be the value in column C on the current row
#            - {C: B:} the value will be the concatenation of values found in column C and B of the current row
#            - {Deliveries} the value will be Deliveries
#            - {BJ: " - " BL:} the value will be the concatenation of value found in column BJ, 
#            	 string " - " and the value found in column BL fo the current row  
#            - {OtherSheet!C:} the value will be the value in column C from the sheet OtherSheet on the current row
#
#  - <condition> is a list of conditions. An empty condition is always true. A condition is a column name followed by colon, 
#              optionally followed by a perl regexp. Optional sheet name can be added (with ! char to separate from the column name)
#		Examples:
#       - {B:} the value in column B must be empty on the current row
#       - {B:.+} the value in column B can not be empty on the current row
#       - {B:R_.+} the value in column B is a word starting by R_ on the current row
#       - {A: B:.+ C:R_.+} the value in column A must be empty and the value in column B must contain something and 
#       	the column C contains a word starting with R_  on the current row
#       - {OtherSheet!B:.+} the value in column B from sheet OtherSheet on the current row can not be empty.

# #############
# # ARTEFACTS #
# #############
# The variable is a list of artefact hierarchy specification:
# {ArtefactHierarchySpec1 ArtefactHierarchySpec2 ... ArtefactHierarchySpecN}
# where each ArtefactHierarchySpecx is a list of ArtefactSpec
#
# An ArtefactSpec is a list of items, each item being:
# {<(sheetName!)?artefactType> <conditions> <name> <parentType>? <parentName>?}
# where:
#    - <(sheetName!)?artefactType>: allows specifying the type. Optional sheetName can be added (with ! char to separate from the type) to limit 
#                                  the artefact search in one specific sheet. When Sheets are given with regexp, the same regexp has to be used
#                                  for the sheetName.
#                                  If the type is followed by a question mark (?), this level of artefact is optional.
#                                  If the type is followed by a plus char (+), this level is repeatable on the next row
#    - <condition>: see COMMON DEFINITIONS
#    - <value>: the name of the artefact to build, see COMMON DEFINITIONS
#
#   - <parentType>: This element is optional. When present, it means that the current element will be attached to a parent having this type
#   - <parentValue>: This is a list like <value> to build the name of the artefact of type <parentType>. If such artefact is not found, 
#                  the current artefact does not match
#
# Note: to add metrics at application level, specify an APPLICATION artefact which will match only one line:
#       e.g. {APPLICATION {A:.+} {}} will recognize as application the line having column A not empty.
set ArtefactsSpecs {
	{
		{DELIVERY {} {Deliveries}}
		{RELEASE {E:.+} {E:}}
		{SPRINT {O:SW_Software} {Q:}}
	}	
	{
		{DELIVERY {} {Deliveries}}
		{RELEASE {O:SY_System} {Q:}}
	}
	{
		{WP {BL:.+ AF:.+} {BJ: " - " BL:} SPRINT {AF:}}
		{ChangeNotes!TASK {D:(added|changed|unchanged) T:imes} {W: AD:}}
	}
	{
		{WP {} {{Unplanned imes}} SPRINT {AF:}}
		{TASK {BL: D:(added|changed|unchanged) T:imes W:.+} {W: AD:}}
	}
}

# ###########
# # METRICS #
# ###########
# Specification of metrics to be retreived
# This is a list where each element is:
#  {<artefactTypeList> <metricId> <condition> <value> <format>}
# Where:
#     - <artefactTypeList>: the list of artefact types for which the metric has to be used
#                       each element of the list is (sheetName!)?artefactType where sheetName is used
#                       to restrict search to only one sheet. sheetName is optional.
#     - <metricId>: the name of the MeasureId to be injected into Squore, as defined in your analysis model
#     - <confition>: see COMMON DEFINITIONS above. This is the condition for the metric to be generated.
#     - <value> : see COMMON DEFINITIONS above. This is the value for the metric (can be built from multi column)
#     - <format> : optional, defaults to NUMBER
#                Possible format are: 
#							* DATE_FR, DATE_EN for date stored as string
#							* DATE for cell formatted as date
#                           * NUMBER_FR, NUMBER_EN for number stored as string
#							* NUMBER for cell formatted as number
#                           * LINES for counting the number of text lines in a cell
#     - <formatPattern> : optional
#                Only used by the LINES format.
# 				 This is a pattern (can contain perl regexp) used to filter lines to count
set MetricsSpecs {
	{{RELEASE SPRINT} TIMESTAMP {} {A:} DATE_EN}
	{{RELEASE SPRINT} DATE_ACTUAL_RELEASE {} {S:} DATE_EN}
	{{RELEASE SPRINT} DATE_FINISH {} {T:} DATE_EN}
	{{RELEASE SPRINT} DELIVERY_STATUS {} {U:}}
	{{WP} WP_STATUS {} {BO:}}
	{{ChangeNotes!TASK} IS_UNPLAN {} {BL:}}
	{{TASK WP} DATE_LABEL {} {BP:} DATE_EN}
	{{TASK WP} DATE_INTEG_PLAN {} {BD:} DATE_EN}
	{{TASK} TASK_STATUS {} {AE:}}
	{{TASK} TASK_TYPE {} {AB:}}
}

# ############
# # FINDINGS #
# ############
# This is a list where each element is:
#  {<artefactTypeList> <findingId> <condition> <value> <localisation>}
# Where:
#     - <artefactTypeList>: the list of artefact type for which the metric has to be used
#                       each element of the list is (sheetName!)?artefactType where sheetName is used
#                       to restrict search to only one sheet. sheetName is optional.
#     - <findingId>: the name of the FindingId to be injected into Squore, as defined in your analysis model
#     - <confition>: see COMMON DEFINITIONS above. This is the condition for the finding to be triggered.
#     - <value>: see COMMON DEFINITIONS above. This is the value for the message of the finding (can be built from multi column)
#     - <localisation>: this a <value> representing the localisation of the finding (free text)
set FindingsSpecs {
	{{WP} {BAD_WP} {BL:.+ AF:.+} {{This WP is not in a correct state } AF:.+} {A:}}
}

# #######################
# # TEXTUAL INFORMATION #
# #######################
# This is a list where each element is:
#  {<artefactTypeList> <infoId> <condition> <value>}
# Where:
#     - <artefactTypeList> the list of artefact types for which the info has to be used
#                       each element of the list is (sheetName!)?artefactType where sheetName is used
#                       to restrict search to only one sheet. sheetName is optional.
#     - <infoId> : is the name of the Information to be attached to the artefact, as defined in your analysis model
#     - <confition> : see COMMON DEFINITIONS above. This is the condition for the info to be generated.
#     - <value> : see COMMON DEFINITIONS above. This is the value for the info (can be built from multi column)
set InfosSpecs {
	{{TASK} ASSIGN_TO {} {XB:}}
}

# ########################
# # LABEL TRANSFORMATION #
# ########################
# This is a list value specification for MeasureId or InfoId:
#    <MeasureId|InfoId> { {<LABEL1> <value1>} ... {<LABELn> <valuen>}}
# Where:
#    - <MeasureId|InfoId> : is either a MeasureId, an InfoId, or * if it is available for every measureid/infoid
#    - <LABELx> : is the label to macth (can contain perl regexp)
#    - <valuex> : is the value to replace the label by, it has to match the correct format for the metrics (no format for infoid)
#
# Note: only metrics which are labels in the excel file or information which need to be rewriten, need to be described here.
set Label2ValueSpec {
	{
		STATUS {
			{OPENED 0}
			{ANALYZED 1}
			{CLOSED 2}
			{.* -1}
		}
	}
	{
		* {
			{FATAL 0}
			{ERROR 1}
			{WARNING 2}
			{{LEVEL:\s*0} 1}
			{{LEVEL:\s*1} 2}
			{{LEVEL:\s*[2-9]+} 3}
		}
	}
}

Note that a sample Excel file with its associated config.tcl is available in $SQUORE_HOME/addons/tools/ExcelMetrics in order to further explain available configuration options.