Monday, August 28, 2017

GSOC 2017: FINAL EVALUATION

Short description of the project:

Implementation of:
  •  a new construction called TABLE VALUE CONSTRUCTOR (TVC)
  • optimization that transforms IN predicate into IN subquery that uses TVC.
What's done:
for TVC implementation:

-construct of TVC (including the case when TVC is used as derived table)
Non-terminals in sql_yacc.yy: table_value_constructor, derived_table_value_constructor
 -class table_value_constr
 Includes the list of values that defines this TVC
-TVC handling  
Pointer to table_value_constr is a member of st_select_lex class. As TVC is needed to be materialized it is handled as UNION.
For the class table_value_constr the methods prepare, exec and optimize are written.
-print method for TVC
-tests: table_value_constr.test
-comments on each new method
*TVC can be used in all constructions except subqueries. Due to the specific implementation of subqueries TVC can be used in a subquery only when it is wrapped into a derived table.

for optimization implementation:
-global variable that controls optimization working (in_subquery_conversion_threshold)
It works only when the total number of scalar values used in IN predicate is more than this variable
-creation of TVC specified by values from IN predicate
-creation of new statement SELECT * FROM (VALUES values) AS tvc_0
Methods:  make_new_subselect_for_tvc
-creation of IN-subselect defined by this statement
Transformer: Item_func_in::in_predicate_to_in_subs_transformer
-tests: opt_tvc.test
-comments on each new method

Optimization works when IN predicate is used in WHERE clause of the query or when it is ON expression of some join. It also works when IN predicate is in derived tables, CTEs and subqueries.

New files added:
sql_tvc.cc, sql_tvc.h
table_value_constr.test
opt_tvc.test

Monday, August 22, 2016

Final evaluation

Short description of the project:

Implementation of the following optimization of the queries that use non-mergiable views and/or derived tables.
The conditions in the WHERE clause of such a query that depend only on the columns of non-mergiable view/derived
tables are pushed into the query defining this view/derived table.
In the general case conditions can be pushed only in the HAVING clause of the non-mergiable view/derived tables,
but at some conditions it makes sence to push them into the WHERE clause.

What's done:

-building item clones;
Methods: virtual build_clone, virtual get_copy.

-pushing conditions into HAVING;
Methods:
  1. TABLE_LIST::check_pushable_cond_for_table:
    Mark subformulas of a condition unusable for the condition pushed into table.
  2. TABLE_LIST::build_pushable_cond_for_table:
    Build condition extractable from the given one depended only on this table.

-pushing conditions into WHERE;

Methods:
  1. st_select_lex::check_cond_extraction_for_grouping_fields:
    For a condition check possibility of exraction a formula over grouping fields.
  2. st_select_lex::extract_cond_for_grouping_fields:
    Build condition extractable from the given one.

-pushing conditions into embedded derived tables;
-using equalities to extract pushable conditions;
-implementation of the case when derived table with UNION is used;
-tests:
  derived_cond_pushdown.test;
-comments to each new method;


The code is ready to be merged into MariaDB 10.2.

Here is the list of commits on github:


https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197

And here is the commit with the consolidated patch:

https://github.com/MariaDB/server/compare/10.2...shagalla:10.2-mdev9197-cons

Saturday, May 21, 2016

Building items clones

Finially I finished my project on recursive CTE, so I can resume working on my project of GSOC 2016.

As I said in my previous post, I decided to start with building items clones (items are used to build different kinds of expressions in MySQL).

Items are typical tree structures. If I just copy a node in the item structure, pointers to the subtrees won't be right. Fortunately this pointers can be fixed in the method Item_func_or_sum::build_clone. To copy nodes we use copy constructor. As we don't know the type of the node in general the copy constructor should be virtual. I called it get_copy. It should be implemented for each terminal class of items. There are dozens of such classes.

How they could be caught?

I used the following trick.
My get_copy for the base class Item is 'pseudo-abstract':

Item *tem::get_copy(...) { dbug_print_item(this); DBUG_ASSERT(0); return 0; }

Here dbug_print_item(this) helps me to understand for which class get_copy is missed.
If I call now method build_clone, for example in JOIN::optimize_inner(), to build a clone for the WHERE condition, and launch tests, it'll be easy for me to understand for which classes lack implementations of get_copy.

Here I've faced some minor problem: how to skip the queries that are excuted by mtr before it starts running the tests.
Anybody knows?
(Now I have a workaround: call build_clone() conditinally and manually trigger the condition. Of course it's not nice.)

Thursday, May 5, 2016

Beginning

First of all I went on github (my nick there is shagalla) and cloned a server code from my tree into the new branch. I called it 10.2-mdev9197 and I'm going to do all my work on it. As I had worked with MariaDB server before on my previous projects (on implementation of common table expressions), building server code was really not so difficult for me.
So I solved my first task successfully during the community bonding period!

Now it's time to talk about my project in detail.
If you are interested in it you can read about it on jira, just click this link -> click 

As you see, it's an optimization work and its main goal is to make the query faster. Now it's slow because any view/derived table defined by a grouping query (look at the example on jira) is first materialized in a temporary table.

To solve this problem I suggested to divide my work into a few issues:

1. Implementation of the case with separable conditions (conjunctive conditions that depend only on view/derived table columns)
2. Implementation of the case with non-separable conditions (there are no conjunctive conditions depending only on the columns of view/derived table, but still some restrictive condition depending only on view/derived table can be extracted from the where clause)
3. Building item clones (it's necessary for issue 2)
4. Implementation of the case with semantically separable conditions
5. Implementation of the case with separable conditions when several views are used
6. Detection of semantically separable conditions for comparison predicates.

I'm going to give you information in detail while considering each issue.

I talked with igor_seattle and we decided to start with building item clones.
In general case building a clone for an item is quite a big task (e.g. when the item contains subquery), but it's not so difficult for simple predicates and functions (like inequality, addition) if we use copy constructors.

In my next post I'm going to tell you about my progress in solving this particular problem.

Wednesday, May 4, 2016

Community bonding

Hi!

My name is Galina Shalygina and this summer I'll be working on the project for GSoC with MariaDB company. I'm so happy to be the part of GSoC, it's such an interesting experience!
The title of my project is 'Pushing conditions into non-mergeable views and derived tables in MariaDB'. My mentors on this projects will be Igor Babaev and Sergey Petrunya. 

Now, during the community bonding period (April 22 - May 22) I'm going to:

1. Clone server code and build it;
2. Read documentation;
3. Study all developments on my theme;
4. For this project I will need to be able to clone expressions, so I'll start to work on solving this problem.