'Converting SAS's PROC SQL into Python pandas library (with where clause)

I am determining the weights of stocks at the end of each month (e.g., 2016-12-29), and invest in those stocks for the following month (e.g., from 2017-01-02 to 2017-01-31) and calculate portfolio's return (weights times ret for given tickers for a given month-end date).

In order to do so, I was wondering a left join given below, in SAS code;

proc sql;
create table merged as
select a.*, b.weights 
from prices as a, weights as b
where
a.ticker = b.ticker
and
a.date between b.date_st and b.date_end;
quit;

where table formats (in DataFrames) are given as follows.

Table prices (other than first three observations are omitted for brevity):

                    PX_LAST       ret 
ticker  date                                             
A069500 2002-10-14   5631.0       NaN
        2002-10-15   5769.0  0.024507
        2002-10-16   5791.0  0.003813
...

which contains multiple tickers and multiple dates for a given ticker. (PX_LAST means stock's closing price and ret means its daily return.)

Table weights (cross-sectional table; no observation is omitted):

                     weight  date_st    date_end
ticker  date
A069500 2016-12-29  2.00E-01    2017-01-02  2017-01-31
A130680 2016-12-29  1.00E-01    2017-01-02  2017-01-31
A132030 2016-12-29  1.45E-11    2017-01-02  2017-01-31
A136340 2016-12-29  1.34E-11    2017-01-02  2017-01-31
A138230 2016-12-29  9.22E-12    2017-01-02  2017-01-31
A143850 2016-12-29  2.00E-01    2017-01-02  2017-01-31

Is it possible to query the above SAS-SQL in Python's Pandas library? I searched for examples, but couldn't find the one with

dealing with two (or more) tables, and
"between" condition in where clause.

Maybe I could slice prices table and weights table by each month-end date and re-join them altogether, but that's seemingly highly ineffective.

Any thoughts or comments would be much appreciated.

(Sorry for not posting what I've tried; I couldn't come up with any valid Python-equivalent example of SAS SQL.)

(Edited) Expected output would be

                   PX_LAST  ret   weight
ticker      date        
A069500 2017-01-02  ...         2.00E-01
        2017-01-03  ...         2.00E-01
        2017-01-04  ...         2.00E-01
    …       
A069500 2017-01-31  ...         2.00E-01
A130680 2017-01-02  ...         1.00E-01
    …       
A130680 2017-01-31  ...         1.00E-01

(values of columns PX_LAST, ret of the table prices are omitted here for brevity.)

Results of prices.head(5).to_dict('list') and weights.head(5).to_dict('list') are

{'PX_LAST': [5631.0, 5769.0, 5791.0, 5871.0, 6125.0],
 'ret': [nan,
  0.024507192328183214,
  0.0038134858727683074,
  0.013814539803142845,
  0.04326349855220579]},

and

{'weights': [0.1999999999329153,
  0.09999999999852502,
  1.4490077570181562e-11,
  1.3382147933394332e-11,
  9.215515264984838e-12],
 'date_st': [Timestamp('2016-01-12 00:00:00'),
  Timestamp('2016-01-12 00:00:00'),
  Timestamp('2016-01-12 00:00:00'),
  Timestamp('2016-01-12 00:00:00'),
  Timestamp('2016-01-12 00:00:00')],
 'date_end': [Timestamp('2016-12-29 00:00:00'),
  Timestamp('2016-12-29 00:00:00'),
  Timestamp('2016-12-29 00:00:00'),
  Timestamp('2016-12-29 00:00:00'),
  Timestamp('2016-12-29 00:00:00')]}

, respectively.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source

'Converting SAS's PROC SQL into Python pandas library (with where clause)

Sources

Related Questions