EMC2 running on Raspberry Pi?

16 Mar 2013 02:26 #31469 by metachris
I can see that working, as long as a few percent of fluctuation in the pulse widths does not matter (e.g. it is impossible to achieve 500 kHz because the fluctuations add up; the exact limit is not yet known). I think updating pwm.c in RPIO with this functionality might be quite trivial.

16 Mar 2013 02:33 #31470 by mozmck
PCW: Ok, I see. I was wondering how/if a PWM channel could be made to do that, but if you can control GPIO directly with DMA that's another matter. Sounds like an interesting idea.

16 Mar 2013 06:20 - 16 Mar 2013 06:22 #31477 by metachris

No PWM involved, all the DMA does is copy from an array in memory to the GPIO port (1 word per usec, for example). By arranging things properly I can DMA data from memory continuously.

So say I want a 5 usec pulse on the output, I have this data in the memory:
...
0000000
0000001
0000001
0000001
0000001
0000001
0000000
0000000
...

By varying the spacing of pulses in the memory record I can generate any step rate I want
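
A minimal sketch of the memory layout described above, assuming one 32-bit word per sample and that a set bit means "pin high" for that sample. The helper name fill_pulse_train and its parameters are hypothetical and are not part of RPIO or LinuxCNC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill buf[0..n) so that pin_mask is high for `width` samples at the start
 * of every `spacing` samples. Hypothetical helper, not an RPIO function.
 * Returns the number of complete pulses written. */
size_t fill_pulse_train(uint32_t *buf, size_t n, uint32_t pin_mask,
                        size_t width, size_t spacing)
{
    size_t i, j, pulses = 0;

    assert(width > 0 && spacing > width);   /* keep a low gap between pulses */
    memset(buf, 0, n * sizeof(*buf));       /* all pins low by default */
    for (i = 0; i + width <= n; i += spacing) {
        for (j = 0; j < width; j++)
            buf[i + j] |= pin_mask;         /* pulse stays high for `width` samples */
        pulses++;
    }
    return pulses;
}
```

With 1 usec samples, width 5 and spacing 10 would give 5 usec pulses at a 100 kHz step rate; a larger spacing lowers the rate without touching the pulse width.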


We still use the PWM/PCM hardware for the DMA timings (set to 10MHz in here: github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c#L599), and the DMA writes our memory to the GPIOs. Apps using this would need to time their writes to that memory quite precisely in order to avoid duplicate step instructions. We could possibly program the DMA part so it does not repeat steps once they have been executed (let's hope we do not need this). How exactly can we time the writes in LinuxCNC?

If this runs at 1MHz, that requires the app to write 1M of control instructions every second, and we'd have a million possible step instructions per second. What is typically needed by steppers with LinuxCNC or achieved by hardware drivers?

16 Mar 2013 06:37 #31478 by mhaberler
Hi Chris,

the standard 'servo cycle' - the rate by which target positions are updated - is 1 kHz, but some people run this at higher rates up to maybe 10 kHz max; that would also be the rate by which stepper speed gets adjusted

PCW can probably fill in better what realistic maximum step rates are, but let's assume the worst case with fine microstepping:

I wouldn't assume step rates would exceed 100 kHz

so as a ballpark figure you'd have a maximum of 100 steps per servo cycle

maybe a viable method is to use a dual-buffer scheme and switch the DMA between sample buffers at servo rate to avoid sync problems
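
A rough sketch of the dual-buffer idea, under the assumption that the sample buffer is split into two halves of one servo period each and that the servo thread simply alternates between them. The 100 samples per period come from the 100 kHz / 1 kHz figures above; step_dbuf_t and dbuf_next_half are made-up names for illustration. How the DMA position is kept in step with the servo thread is left open here.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLES_PER_PERIOD 100u  /* assumption: 100 kHz samples, 1 kHz servo cycle */

typedef struct {
    uint32_t samples[2 * SAMPLES_PER_PERIOD]; /* two servo periods, back to back */
    unsigned fill_half;                       /* half the servo thread writes next */
} step_dbuf_t;

/* Return the half the servo thread may fill now, then flip to the other
 * half for the next servo cycle. Hypothetical helper. */
uint32_t *dbuf_next_half(step_dbuf_t *b)
{
    uint32_t *half;

    assert(b->fill_half < 2u);
    half = &b->samples[b->fill_half * SAMPLES_PER_PERIOD];
    b->fill_half ^= 1u;
    return half;
}
```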

- Michael

16 Mar 2013 07:02 #31479 by PCW
we have customers using steprates into the MHz but
200 KHZ would probably make 99 % happy

I think the trick is synchronizing the LinuxCNC memory update to the DMA.
A circular buffer holding about twice the servo thread period is probably needed,
but since LinuxCNC's servo thread will likely not run synchronously with
the buffer readout (unless the readout generates the interrupt), I would expect that the
code that fills the buffer with the next segment of step data would need to
read the current DMA pointer and "splice" in the next segment's worth of data to play out.
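
One way to picture the "splice" step is a plain ring-buffer write that starts a little ahead of wherever the DMA is currently reading. This is only a sketch: splice_segment is a hypothetical helper, and dma_index is passed in as an argument, whereas real code would have to derive it from the DMA controller's current control block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy seg_len samples into the ring, starting `lead` samples ahead of the
 * index the DMA is reading right now. Returns the index just past the
 * spliced data. Hypothetical helper, not LinuxCNC or RPIO code. */
size_t splice_segment(uint32_t *ring, size_t ring_len, size_t dma_index,
                      size_t lead, const uint32_t *seg, size_t seg_len)
{
    size_t i, w;

    /* the write must not wrap all the way around onto the reader */
    assert(ring_len > 0 && lead + seg_len <= ring_len);
    w = (dma_index + lead) % ring_len;
    for (i = 0; i < seg_len; i++) {
        ring[w] = seg[i];
        w = (w + 1) % ring_len;     /* wrap at the end of the ring */
    }
    return w;
}
```

The lead has to be large enough that the DMA does not overtake the writer while the segment is being copied.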

16 Mar 2013 09:12 #31486 by andypugh

we have customers using steprates into the MHz but
200 KHZ would probably make 99 % happy


50kHz with finer than 5kHz granularity would be more than typical software systems manage.

A parport system can typically manage 20khz or 15khz or 10khz… Motors can't follow those steps.

Steppers don't often run above 500rpm. With 8x microstepping that is about 10kHz. What does matter is that 10kHz with 500Hz resolution is massively more useful than 50kHz with 20kHz resolution.
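
The numbers above can be checked with a little integer arithmetic. This sketch assumes a 200 full-step-per-revolution motor (not stated in the post) and reads "resolution" as the gap between neighbouring achievable rates when a step period must be a whole number of samples; both function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Step frequency in Hz for a given motor speed. steps_per_rev = 200 is an
 * assumption (1.8 degree motor), not a figure from the thread. */
uint32_t step_freq_hz(uint32_t rpm, uint32_t steps_per_rev, uint32_t microsteps)
{
    return (uint32_t)(((uint64_t)rpm * steps_per_rev * microsteps) / 60u);
}

/* Gap in Hz between the rate fs/n and the next lower rate fs/(n+1), when a
 * step period has to be a whole number n of samples. */
uint32_t rate_gap_hz(uint32_t sample_rate_hz, uint32_t n)
{
    assert(n > 0);
    return sample_rate_hz / n - sample_rate_hz / (n + 1u);
}
```

Under these assumptions step_freq_hz(500, 200, 8) gives 13333 Hz, the same order as the "about 10kHz" above. rate_gap_hz(100000, 2) is roughly 17 kHz (a 50 kHz rate on a 100 kHz grid), while rate_gap_hz(200000, 20) is roughly 0.5 kHz (a 10 kHz rate on a 200 kHz grid).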

19 Apr 2013 19:34 #32915 by mungkie
I got back to looking at the RPi again, but I am unfortunately still waiting for some parts so I can set up a cheap PIC dev environment, so I had a quick look at the most recent topic: RPIO DMA as a step generator.

I have spent probably 10 hours over the last 3 days looking at the code and feel like it's way too complicated for me. I still do not properly understand what is going on; it's an interesting problem at times, but also frustrating.

I will try to explain in simple terms how I think things may work, as it may help me get things organised in my mind so I understand them better, and hopefully it will also allow others to offer corrections.

Please make allowances, as this is a hack job and I feel like I really don't understand the internals of LinuxCNC well enough to do this, but I thought I would give it a try to help generate the interest of other people who could maybe do this better.

I may have everything totally confused and not understand it correctly; if I am wrong about any of this, please feel free to explain and help me.

I feel 60% certain that I understand the rpio dma code and that it works as follows...

*********************** rpio dma pwm io
DMA is controlled by 32-byte dma_control_blocks that form a linked list of DMA operations. To get constant stepping of the gpio output, each 'sample' requires 2 dma_control_blocks: one block holds the data for each step of gpio output, and the next block is paced by the pwm and effectively pauses the next change of gpio until the pwm toggles.

The control block holds:

*src memory (pointer to 32 bits that define gpio pin values),
*the dst memory (the actual gpio register),
*the amount of data to transfer (32 bits=4 bytes)
*the next block to load into the dma controller
*other stuff....

So for every gpio step we must allocate and create 64 bytes of control block data (2 control blocks)
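
As a quick check of that footprint, here is a sketch that computes the memory needed for n samples, assuming one 32-bit sample word plus two 32-byte control blocks per sample. The struct mirrors the 8-word layout of dma_cb_t in the code posted further down; dma_cb_sketch_t, SKETCH_PAGE_SIZE and pages_for_samples are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Same 8-word (32-byte) layout as the control block described above */
typedef struct {
    uint32_t info, src, dst, length, stride, next, pad[2];
} dma_cb_sketch_t;

#define SKETCH_PAGE_SIZE 4096u

/* Pages needed for n samples: each sample costs one 32-bit word plus two
 * control blocks, i.e. 4 + 64 = 68 bytes. */
uint32_t pages_for_samples(uint32_t n)
{
    uint32_t bytes;

    assert(sizeof(dma_cb_sketch_t) == 32);
    bytes = n * (uint32_t)(sizeof(uint32_t) + 2u * sizeof(dma_cb_sketch_t));
    return (bytes + SKETCH_PAGE_SIZE - 1u) / SKETCH_PAGE_SIZE;  /* round up */
}
```

For 100 samples that is 6800 bytes, so two 4 KiB pages.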

*********************************** rough thoughts on the hal driver....

Basically I will try to integrate stepgen with the rpio pwm dma code and remove the stepgen output signals (up, down, step, dir, phasex) from HAL. This should require mods to only two functions in stepgen.c, rtapi_app_main() and make_pulses(void *arg, long period), plus some extra parsing of the hal file to configure which gpio pins are used by stepgen.

We need to double buffer, so that while the dma is writing out a servo period's worth of samples the 'make_pulses' function is creating the next.

In linuxcnc we probably want a 1 kHz servo loop and a 50 kHz step rate, so we need 50 'samples' per servo period, i.e. 100 'samples' for the double buffer { sample_buffer = malloc(100 * (4 bytes)) }

So the setup code in rtapi_app_main() does almost exactly the same as the rpio pwm demo: it sets up the hardware registers, then loops through sample_buffer creating a control_block linked list pointing to each sample:

for (i = 0; i < buffer_len; i++) {
    /* data block: copies one sample word to the gpio register */
    control_block->src  = sample_buffer + i;
    control_block->dst  = gpio_reg;
    control_block->next = control_block + 1;
    control_block++;

    /* clock period timing control block that sets the update frequency */
    control_block->src  = pwm;
    control_block->dst  = null;
    control_block->next = control_block + 1;
    control_block++;
}
The final control block points back to the first, so the list loops back to the start again.
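
The same linked-list idea can be tried out on a PC with array indices standing in for the physical bus addresses the real DMA controller needs. This is only a sketch: cb_sketch_t, DST_GPIO, DST_PWM_FIFO and build_cb_ring are hypothetical, and the pacing block is assumed to target the PWM FIFO, as the posted driver code does.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t src;   /* index of the sample word this block copies */
    uint32_t dst;   /* DST_GPIO or DST_PWM_FIFO */
    uint32_t next;  /* index of the next control block */
} cb_sketch_t;

enum { DST_GPIO = 1, DST_PWM_FIFO = 2 };

/* Build 2*n blocks in cb[]: even blocks copy sample i to the gpio, odd
 * blocks wait on the PWM FIFO. The last block links back to block 0. */
void build_cb_ring(cb_sketch_t *cb, uint32_t n)
{
    uint32_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        cb[2 * i].src  = i;
        cb[2 * i].dst  = DST_GPIO;
        cb[2 * i].next = 2 * i + 1;

        cb[2 * i + 1].src  = 0;                       /* any data will do */
        cb[2 * i + 1].dst  = DST_PWM_FIFO;
        cb[2 * i + 1].next = (2 * i + 2) % (2 * n);   /* wraps to 0 at the end */
    }
}
```

Following next from block 0 visits all 2*n blocks and arrives back at block 0, which is what makes the DMA replay the sample buffer endlessly.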


Then all that needs to be done is to rewrite make_pulses slightly so that it writes 50 steps to the correct half of the sample_buffer every time it is called.

All stepgen functions will run at the servo_thread period, including make_pulses! I am not sure, but I assume that the 'period' variable that is the argument of 'make_pulses' is the exact elapsed time in nanoseconds since make_pulses was last called by the RT thread. Can anyone confirm that?

The 'period' argument will possibly be used to see if an overrun occurs and the writing to the buffer needs to be realigned; the actual step period should be constant, as defined by the pwm dma.

Not sure what will need to change in 'update_period' and 'update_freq'?

I hope that I will get some code for this driver posted here on the forum this weekend. There are probably a lot of unseen things I still need to get my mind around, but I have tinkered with the code and started pasting things together last night. I will probably get a rough test going by the end of the weekend, but I do not have any equipment to actually test the signals for dropped steps, overruns etc. Is there anyone else working on this?

20 Apr 2013 08:54 - 20 Apr 2013 09:07 #32939 by mungkie
Damn, it's almost 3am and I have got to a state of fugue, maybe from too much focus on this problem, as I have lost all idea of what the hell I was even trying to do. So I am quitting for now; I may get a chance to plug some drivers into the gpios this weekend and test the code.

It compiles and loads into axis without crashing the system. I have no idea if the DMA code is doing its job. I feel like nothing will get done tomorrow, maybe Sunday.

I think there is only a 10% chance it will do anything when I run it. I am also sure the code is a terrible hack job; lots of things are hard coded and need reconsideration.

I also just realised that I have not got the hal config files here to post to the forum, but it is fairly simple to make your own: just use the standard stepper config, but change the make-pulses addf to the servo thread and change the module name from stepgen to rpi_stepgen.

Any help/advice or testing would be welcome.


#define RPI_DMA_STEP
/********************************************************************
* Description:  rpi_stepgen.c
*               This file, 'rpi_stepgen.c', is a HAL component that 
*               provides software based step pulse generation for raspberry pi via DMA.
* Author: Mung
*/
// Access from ARM Running Linux
/* gpio P1 header pins as follows..
I hate to do this but I am only going to support rev 2 boards as there is more IO and they don't make rev 1 anymore....

3.3v 		[		] 5v
gpio 2 SDA	[		] 5v
gpio 3 SCL	[		] gnd
gpio 4 gpclk0	[		] gpio14 txd
gnd		[		] gpio15 rxd
gpio17		[		] gpio18 pcm_clk
gpio27 pcm_dout	[		] gnd
gpio22		[		] gpio 23
3.3v		[		] gpio 24
gpio10 mosi	[		] gnd
gpio9 miso	[		] gpio25
gpio11 sclk	[		] gpio8 ce0
gnd		[		] gpio7 ce1

gpio P5 header pins as follows...

5v 		[		] 3.3v
gpio 28		[		] 29
gpio 30		[		] 31
gnd		[		] gnd
looks like we can get P1 2,3,4,7,8,9,10,11,14,15,17,18,22,23,24,25,27 
and P5 28,29,30,31 =21 io pins

simply shift the bit in for number of pin

But I think we possibly want to use i2c (2 sda, 3 scl), SPI (8 CE0, 7 CE1, 9 miso, 10 mosi, 11 sclk) and uart (14 tx, 15 rx)
*/

#include "rtapi.h"		/* RTAPI realtime OS API */
#include "rtapi_app.h"		/* RTAPI realtime module decls */
#include "hal.h"		/* HAL public API decls */

#include <float.h>
#include "rtapi_math.h"

#define MAX_CHAN 8
#define MAX_CYCLE 10
#define USER_STEP_TYPE 13

MODULE_AUTHOR("Mung");
MODULE_DESCRIPTION("rpi DMA Step Pulse Generator for EMC HAL");
MODULE_LICENSE("GPL");
int step_type[MAX_CHAN] = { -1, -1, -1, -1, -1, -1, -1, -1 };
RTAPI_MP_ARRAY_INT(step_type,MAX_CHAN,"stepping types for up to 8 channels");
int step_pins[MAX_CHAN*5] = { -1, -1, -1, -1, -1, -1, -1, -1 , -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
/* NOTE: step_pins holds MAX_CHAN*5 entries, but only MAX_CHAN are accepted here */
RTAPI_MP_ARRAY_INT(step_pins,MAX_CHAN,"gpio pin numbers for the stepgen outputs");

const char *ctrl_type[MAX_CHAN];
RTAPI_MP_ARRAY_STRING(ctrl_type,MAX_CHAN,"control type (pos or vel) for up to 8 channels");
int user_step_type[MAX_CYCLE] = {-1,-1,-1,-1,-1,-1,-1,-1,-1,-1};
RTAPI_MP_ARRAY_INT(user_step_type, MAX_CYCLE,
	"lookup table for user-defined step type");

/***********************************************************************
*                STRUCTURES AND GLOBAL VARIABLES                       *
************************************************************************/

/** This structure contains the runtime data for a single generator. */

/* structure members are ordered to optimize caching for makepulses,
   which runs in the fastest thread */

typedef struct {
    /* stuff that is both read and written by makepulses */
    unsigned int timer1;	/* times out when step pulse should end */
    unsigned int timer2;	/* times out when safe to change dir */
    unsigned int timer3;	/* times out when safe to step in new dir */
    int hold_dds;		/* prevents accumulator from updating */
    long addval;		/* actual frequency generator add value */
    volatile long long accum;	/* frequency generator accumulator */
    hal_s32_t rawcount;		/* param: position feedback in counts */
    int curr_dir;		/* current direction */
    int state;			/* current position in state table */
    /* stuff that is read but not written by makepulses */
    hal_bit_t *enable;		/* pin for enable rpi_stepgen */
    long target_addval;		/* desired freq generator add value */
    long deltalim;		/* max allowed change per period */
    hal_u32_t step_len;		/* parameter: step pulse length */
    hal_u32_t dir_hold_dly;	/* param: direction hold time or delay */
    hal_u32_t dir_setup;	/* param: direction setup time */
    int step_type;		/* stepping type - see list above */
    int cycle_max;		/* cycle length for step types 2 and up */
    int num_phases;		/* number of phases for types 2 and up */
    hal_bit_t *phase[5];	/* pins for output signals */
    const unsigned char *lut;	/* pointer to state lookup table */
    /* stuff that is not accessed by makepulses */
    int pos_mode;		/* 1 = position mode, 0 = velocity mode */
    hal_u32_t step_space;	/* parameter: min step pulse spacing */
    double old_pos_cmd;		/* previous position command (counts) */
    hal_s32_t *count;		/* pin: captured feedback in counts */
    hal_float_t pos_scale;	/* param: steps per position unit */
    double old_scale;		/* stored scale value */
    double scale_recip;		/* reciprocal value used for scaling */
    hal_float_t *vel_cmd;	/* pin: velocity command (pos units/sec) */
    hal_float_t *pos_cmd;	/* pin: position command (position units) */
    hal_float_t *pos_fb;	/* pin: position feedback (position units) */
    hal_float_t freq;		/* param: frequency command */
    hal_float_t maxvel;		/* param: max velocity, (pos units/sec) */
    hal_float_t maxaccel;	/* param: max accel (pos units/sec^2) */
    hal_u32_t old_step_len;	/* used to detect parameter changes */
    hal_u32_t old_step_space;
    hal_u32_t old_dir_hold_dly;
    hal_u32_t old_dir_setup;
    int printed_error;		/* flag to avoid repeated printing */
} rpi_stepgen_t;

/* ptr to array of rpi_stepgen_t structs in shared memory, 1 per channel */
static rpi_stepgen_t *rpi_stepgen_array;

/* lookup tables for stepping types 2 and higher - phase A is the LSB */

static unsigned char master_lut[][MAX_CYCLE] = {
    {1, 3, 2, 0, 0, 0, 0, 0, 0, 0},	/* type 2: Quadrature */
    {1, 2, 4, 0, 0, 0, 0, 0, 0, 0},	/* type 3: Three Wire */
    {1, 3, 2, 6, 4, 5, 0, 0, 0, 0},	/* type 4: Three Wire Half Step */
    {1, 2, 4, 8, 0, 0, 0, 0, 0, 0},	/* 5: Unipolar Full Step 1 */
    {3, 6, 12, 9, 0, 0, 0, 0, 0, 0},	/* 6: Unipolar Full Step 2 */
    {1, 7, 14, 8, 0, 0, 0, 0, 0, 0},	/* 7: Bipolar Full Step 1 */
    {5, 6, 10, 9, 0, 0, 0, 0, 0, 0},	/* 8: Bipolar Full Step 2 */
    {1, 3, 2, 6, 4, 12, 8, 9, 0, 0},	/* 9: Unipolar Half Step */
    {1, 5, 7, 6, 14, 10, 8, 9, 0, 0},	/* 10: Bipolar Half Step */
    {1, 2, 4, 8, 16, 0, 0, 0, 0, 0},	/* 11: Five Wire Unipolar */
    {3, 6, 12, 24, 17, 0, 0, 0, 0, 0},	/* 12: Five Wire Wave */
    {1, 3, 2, 6, 4, 12, 8, 24, 16, 17},	/* 13: Five Wire Uni Half */
    {3, 7, 6, 14, 12, 28, 24, 25, 17, 19},	/* 14: Five Wire Wave Half */
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0} /* 15: User-defined */
};

static unsigned char cycle_len_lut[] =
    { 4, 3, 6, 4, 4, 4, 4, 8, 8, 5, 5, 10, 10, 0 };

static unsigned char num_phases_lut[] =
    { 2, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 0, };

#define MAX_STEP_TYPE 15

#define STEP_PIN	0	/* output phase used for STEP signal */
#define DIR_PIN		1	/* output phase used for DIR signal */
#define UP_PIN		0	/* output phase used for UP signal */
#define DOWN_PIN	1	/* output phase used for DOWN signal */

#define PICKOFF		28	/* bit location in DDS accum */

/* other globals */
static int comp_id;		/* component ID */
static int num_chan = 0;	/* number of step generators configured */
static long periodns;		/* makepulses function period in nanosec */
static long old_periodns;	/* used to detect changes in periodns */
static double periodfp;		/* makepulses function period in seconds */
static double freqscale;	/* conv. factor from Hz to addval counts */
static double accelscale;	/* conv. Hz/sec to addval cnts/period */
static long old_dtns;		/* update_freq funct period in nsec */
static double dt;		/* update_freq period in seconds */
static double recip_dt;		/* reciprocal of period, avoids divides */
static int current_buffer;
typedef enum CONTROL { POSITION, VELOCITY, INVALID } CONTROL;


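#include <signal.h>
#include <fcntl.h>
#include <stdint.h>	/* uint8_t, uint32_t, uint64_t */
#include <stdio.h>	/* sprintf */
#include <string.h>	/* memset */
#include <time.h>	/* nanosleep */
#include <unistd.h>	/* getpid, lseek, read, close */
#include <sys/mman.h>

void shutdown(void);
int init_channel(int channel, int subcycle_time_us);
void set_softfatal(int enabled);
int is_setup(void);
int is_channel_initialized(int channel);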



// Default pulse-width-increment-granularity

#define PULSE_WIDTH_INCREMENT_GRANULARITY_US_DEFAULT 10



// 15 DMA channels are usable on the RPi (0..14)

#define DMA_CHANNELS    15

// Standard page sizes

#define PAGE_SIZE       4096

#define PAGE_SHIFT      12



// Memory Addresses

#define DMA_BASE        0x20007000

#define DMA_CHANNEL_INC 0x100

#define DMA_LEN         0x24

#define PWM_BASE        0x2020C000

#define PWM_LEN         0x28

#define CLK_BASE        0x20101000

#define CLK_LEN         0xA8

#define GPIO_BASE       0x20200000

#define GPIO_LEN        0x100

//#define PCM_BASE        0x20203000

//#define PCM_LEN         0x24



// Datasheet p. 51:

#define DMA_NO_WIDE_BURSTS  (1<<26)

#define DMA_WAIT_RESP   (1<<3)

#define DMA_D_DREQ      (1<<6)

#define DMA_PER_MAP(x)  ((x)<<16)

#define DMA_END         (1<<1)

#define DMA_RESET       (1<<31)

#define DMA_INT         (1<<2)



// Each DMA channel has 3 writeable registers:

#define DMA_CS          (0x00/4)

#define DMA_CONBLK_AD   (0x04/4)

#define DMA_DEBUG       (0x20/4)



// GPIO Memory Addresses

#define GPIO_FSEL0      (0x00/4)

#define GPIO_SET0       (0x1c/4)

#define GPIO_CLR0       (0x28/4)

#define GPIO_LEV0       (0x34/4)

#define GPIO_PULLEN     (0x94/4)

#define GPIO_PULLCLK    (0x98/4)



// GPIO Modes (IN=0, OUT=1)

#define GPIO_MODE_IN    0

#define GPIO_MODE_OUT   1



// PWM Memory Addresses

#define PWM_CTL         (0x00/4)

#define PWM_DMAC        (0x08/4)

#define PWM_RNG1        (0x10/4)

#define PWM_FIFO        (0x18/4)



#define PWMCLK_CNTL     40

#define PWMCLK_DIV      41



#define PWMCTL_MODE1    (1<<1)

#define PWMCTL_PWEN1    (1<<0)

#define PWMCTL_CLRF     (1<<6)

#define PWMCTL_USEF1    (1<<5)



#define PWMDMAC_ENAB    (1<<31)

#define PWMDMAC_THRSHLD ((15<<8) | (15<<0))



// DMA Control Block Data Structure (p40): 8 words (256 bits)

typedef struct {

    uint32_t info;   // TI: transfer information

    uint32_t src;    // SOURCE_AD

    uint32_t dst;    // DEST_AD

    uint32_t length; // TXFR_LEN: transfer length

    uint32_t stride; // 2D stride mode

    uint32_t next;   // NEXTCONBK

    uint32_t pad[2]; // _reserved_

} dma_cb_t;



// Memory mapping

typedef struct {

    uint8_t *virtaddr;

    uint32_t physaddr;

} page_map_t;



// Main control structure per channel

struct channel {

    uint8_t *virtbase;

    uint32_t *sample;

    dma_cb_t *cb;

    page_map_t *page_map;

    volatile uint32_t *dma_reg;



    // Set by user

    uint32_t subcycle_time_us;



    // Set by system

    uint32_t num_samples;

    uint32_t num_cbs;

    uint32_t num_pages;



    // Used only for control purposes

    uint32_t width_max;

};



// One control structure per channel

static struct channel channels[DMA_CHANNELS];



// Pulse width increment granularity

static uint16_t pulse_width_incr_us = -1;

static uint8_t _is_setup = 0;

static int gpio_setup = 0; // bitfield for setup gpios (setup = out/low)



// Common registers

static volatile uint32_t *pwm_reg;

static volatile uint32_t *clk_reg;

static volatile uint32_t *gpio_reg;



// Sets a GPIO to either GPIO_MODE_IN(=0) or GPIO_MODE_OUT(=1)

static void

gpio_set_mode(uint32_t pin, uint32_t mode)

{

    uint32_t fsel = gpio_reg[GPIO_FSEL0 + pin/10];



    fsel &= ~(7 << ((pin % 10) * 3));

    fsel |= mode << ((pin % 10) * 3);

    gpio_reg[GPIO_FSEL0 + pin/10] = fsel;

}





// Very short delay as demanded per datasheet

static void udelay(int us)

{

    struct timespec ts = { 0, us * 1000 };

    nanosleep(&ts, NULL);

}



// Shutdown -- its important to reset the DMA before quitting

void shutdown(void)

{

    int i;



    for (i = 0; i < DMA_CHANNELS; i++) {

        if (channels[i].dma_reg && channels[i].virtbase) {

           // log_debug("shutting down dma channel %d\n", i);

            clear_channel(i);

            udelay(channels[i].subcycle_time_us);

            channels[i].dma_reg[DMA_CS] = DMA_RESET;

            udelay(10);

        }

    }

}



// Terminate is triggered by signals

static void terminate(int signum)
{
    (void) signum;	/* unused; a signal handler must accept the signal number */
    shutdown();
    hal_exit(comp_id);
}



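static int fatal(char *fmt, ...)
{
    char buf[256];
    va_list ap;

    /* format the variable arguments first; the original passed fmt alone */
    va_start(ap, fmt);
    rtapi_vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    rtapi_print_msg(RTAPI_MSG_ERR, "%s", buf);

    shutdown();
    hal_exit(comp_id);
    return -1;	/* callers test the result against -1 */
}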



// Catch all signals possible - it is vital we kill the DMA engine on process exit!

static void setup_sighandlers(void)

{

    int i;

    for (i = 0; i < 64; i++) {

        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));

        sa.sa_handler = (void *) terminate;

        sigaction(i, &sa, NULL);

    }

}



// Memory mapping

static uint32_t mem_virt_to_phys(int channel, void *virt)

{

    uint32_t offset = (uint8_t *)virt - channels[channel].virtbase;

    return channels[channel].page_map[offset >> PAGE_SHIFT].physaddr + (offset % PAGE_SIZE);

}



// Peripherals memory mapping

static void *

map_peripheral(uint32_t base, uint32_t len)

{

    int fd = open("/dev/mem", O_RDWR);

    void * vaddr;



    if (fd < 0) {

        fatal("rpio-pwm: Failed to open /dev/mem: %m\n");

        return NULL;

    }

    vaddr = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, base);

    if (vaddr == MAP_FAILED) {

        fatal("rpio-pwm: Failed to map peripheral at 0x%08x: %m\n", base);

        return NULL;

    }

    close(fd);



    return vaddr;

}



// Returns a pointer to the control block of this channel in DMA memory

uint8_t*

get_cb(int channel)

{

    return channels[channel].virtbase + (sizeof(uint32_t) * channels[channel].num_samples);

}



// Get a channel's pagemap

static int make_pagemap(int channel)

{

    int i, fd, memfd, pid;

    char pagemap_fn[64];



 //   channels[channel].page_map = malloc(channels[channel].num_pages * sizeof(*channels[channel].page_map));

 channels[channel].page_map = hal_malloc(channels[channel].num_pages * sizeof(*channels[channel].page_map));



    if (channels[channel].page_map == 0)

        return fatal("rpio-pwm: Failed to malloc page_map: %m\n");

    memfd = open("/dev/mem", O_RDWR);

    if (memfd < 0)

        return fatal("rpio-pwm: Failed to open /dev/mem: %m\n");

    pid = getpid();

    sprintf(pagemap_fn, "/proc/%d/pagemap", pid);

    fd = open(pagemap_fn, O_RDONLY);

    if (fd < 0)

        return fatal("rpio-pwm: Failed to open %s: %m\n", pagemap_fn);

    if (lseek(fd, (uint32_t)channels[channel].virtbase >> 9, SEEK_SET) !=

                        (uint32_t)channels[channel].virtbase >> 9) {

        return fatal("rpio-pwm: Failed to seek on %s: %m\n", pagemap_fn);

    }

    for (i = 0; i < channels[channel].num_pages; i++) {

        uint64_t pfn;

        channels[channel].page_map[i].virtaddr = channels[channel].virtbase + i * PAGE_SIZE;

        // Following line forces page to be allocated

        channels[channel].page_map[i].virtaddr[0] = 0;

        if (read(fd, &pfn, sizeof(pfn)) != sizeof(pfn))

            return fatal("rpio-pwm: Failed to read %s: %m\n", pagemap_fn);

        if (((pfn >> 55) & 0x1bf) != 0x10c)

            return fatal("rpio-pwm: Page %d not present (pfn 0x%016llx)\n", i, pfn);

        channels[channel].page_map[i].physaddr = (uint32_t)pfn << PAGE_SHIFT | 0x40000000;

    }

    close(fd);

    close(memfd);

    return 0; //EXIT_SUCCESS;

}



static int init_virtbase(int channel)

{

    channels[channel].virtbase = mmap(NULL, channels[channel].num_pages * PAGE_SIZE, PROT_READ|PROT_WRITE,

            MAP_SHARED|MAP_ANONYMOUS|MAP_NORESERVE|MAP_LOCKED, -1, 0);

    if (channels[channel].virtbase == MAP_FAILED)

        return fatal("rpio-pwm: Failed to mmap physical pages: %m\n");

    if ((unsigned long)channels[channel].virtbase & (PAGE_SIZE-1))

        return fatal("rpio-pwm: Virtual address is not page aligned\n");

    return 0; //EXIT_SUCCESS;

}



// Initialize control block for this channel

static int init_ctrl_data(int channel)

{

    dma_cb_t *cbp = (dma_cb_t *) get_cb(channel);

    uint32_t *sample = (uint32_t *) channels[channel].virtbase;



    uint32_t phys_fifo_addr;

    uint32_t phys_gpclr0 = 0x7e200000 + 0x28;

    int i;



    channels[channel].dma_reg = map_peripheral(DMA_BASE, DMA_LEN) + (DMA_CHANNEL_INC * channel);

    if (channels[channel].dma_reg == NULL)	hal_exit(comp_id);



        phys_fifo_addr = (PWM_BASE | 0x7e000000) + 0x18;



    // Reset complete per-sample gpio mask to 0

    /* clear every sample word; wrapping the size in sizeof() cleared only the first few bytes */
    memset(sample, 0, channels[channel].num_samples * sizeof(uint32_t));



    // For each sample we add 2 control blocks:

    // - first: clear gpio and jump to second

    // - second: jump to next CB

    for (i = 0; i < channels[channel].num_samples; i++) {

        cbp->info = DMA_NO_WIDE_BURSTS | DMA_WAIT_RESP;

        cbp->src = mem_virt_to_phys(channel, sample + i);  // src contains mask of which gpios need change at this sample

        cbp->dst = phys_gpclr0;  // set each sample to clear set gpios by default

        cbp->length = 4;

        cbp->stride = 0;

        cbp->next = mem_virt_to_phys(channel, cbp + 1);

        cbp++;



        // Delay

    //    if (delay_hw == DELAY_VIA_PWM)

            cbp->info = DMA_NO_WIDE_BURSTS | DMA_WAIT_RESP | DMA_D_DREQ | DMA_PER_MAP(5);

 

        cbp->src = mem_virt_to_phys(channel, sample); // Any data will do

        cbp->dst = phys_fifo_addr;

        cbp->length = 4;

        cbp->stride = 0;

        cbp->next = mem_virt_to_phys(channel, cbp + 1);

        cbp++;

    }



    // The last control block links back to the first (= endless loop)

    cbp--;

    cbp->next = mem_virt_to_phys(channel, get_cb(channel));



    // Initialize the DMA channel 0 (p46, 47)

    channels[channel].dma_reg[DMA_CS] = DMA_RESET; // DMA channel reset

    udelay(10);

    channels[channel].dma_reg[DMA_CS] = DMA_INT | DMA_END; // Interrupt status & DMA end flag

    channels[channel].dma_reg[DMA_CONBLK_AD] = mem_virt_to_phys(channel, get_cb(channel));  // initial CB

    channels[channel].dma_reg[DMA_DEBUG] = 7; // clear debug error flags

    channels[channel].dma_reg[DMA_CS] = 0x10880001;    // go, mid priority, wait for outstanding writes



    return 0;

}





// Setup a channel with a specific subcycle time. After that pulse-widths can be

// added at any time.

int init_channel(int channel, int subcycle_time_us)

{


    // Setup Data

    channels[channel].subcycle_time_us = subcycle_time_us;

    channels[channel].num_samples = channels[channel].subcycle_time_us / pulse_width_incr_us;

    channels[channel].width_max = channels[channel].num_samples - 1;

    channels[channel].num_cbs = channels[channel].num_samples * 2;

    channels[channel].num_pages = ((channels[channel].num_cbs * 32 + channels[channel].num_samples * 4 + \

                                       PAGE_SIZE - 1) >> PAGE_SHIFT);



    // Initialize channel

    if (init_virtbase(channel) == -1)	hal_exit(comp_id); //EXIT_FAILURE;

    if (make_pagemap(channel) == -1)	hal_exit(comp_id); //EXIT_FAILURE;

    if (init_ctrl_data(channel) == -1)	hal_exit(comp_id); //EXIT_FAILURE;

    return 0;

}




/***********************************************************************
*                  LOCAL FUNCTION DECLARATIONS                         *
************************************************************************/

static int export_rpi_stepgen(int num, rpi_stepgen_t * addr, int step_type, int pos_mode);
static void make_pulses(void *arg, long period);
static void update_freq(void *arg, long period);
static void update_pos(void *arg, long period);
static int setup_user_step_type(void);
static CONTROL parse_ctrl_type(const char *ctrl);
static uint32_t pinlookup [MAX_CHAN][5];

/***********************************************************************
*                       INIT AND EXIT CODE                             *
************************************************************************/

int rtapi_app_main(void)
{
    int n, retval;
    int pincount = 0;

    current_buffer = 0;


    retval = setup_user_step_type();
    if(retval < 0) {
        return retval;
    }



    for (n = 0; n < MAX_CHAN && step_type[n] != -1 ; n++) {



	if ((step_type[n] > MAX_STEP_TYPE) || (step_type[n] < 0)) {
	    rtapi_print_msg(RTAPI_MSG_ERR,
			    "rpi_stepgen: ERROR: bad stepping type '%i', axis %i\n",
			    step_type[n], n);
	    return -1;
	}

//	if ((step_type[n] > MAX_STEP_TYPE) || (step_type[n] < 0))


	if(parse_ctrl_type(ctrl_type[n]) == INVALID) {
	    rtapi_print_msg(RTAPI_MSG_ERR,
			    "rpi_stepgen: ERROR: bad control type '%s' for axis %i (must be 'p' or 'v')\n",
			    ctrl_type[n], n);
	    return -1;
	}
	num_chan++;
    }
    if (num_chan == 0) {
	rtapi_print_msg(RTAPI_MSG_ERR,"rpi_stepgen: ERROR: no channels configured\n");
	return -1;
    }
    /* periodns will be set to the proper value when 'make_pulses()' runs for 
       the first time.  We load a default value here to avoid glitches at
       startup, but all these 'constants' are recomputed inside
       'update_freq()' using the real period. */


    old_periodns = periodns = 50000;
    old_dtns = 1000000;
    /* precompute some constants */
    periodfp = periodns * 0.000000001;
    freqscale = (1L << PICKOFF) * periodfp;
    accelscale = freqscale * periodfp;
    dt = old_dtns * 0.000000001;
    recip_dt = 1.0 / dt;
    /* have good config info, connect to the HAL */

    comp_id = hal_init("rpi_stepgen");


    if (comp_id < 0) {
	rtapi_print_msg(RTAPI_MSG_ERR,
			"rpi_stepgen: ERROR: hal_init() failed\n");
	return -1;
    }

    /* allocate shared memory for counter data */
    rpi_stepgen_array = hal_malloc(num_chan * sizeof(rpi_stepgen_t));
    if (rpi_stepgen_array == 0) {
	rtapi_print_msg(RTAPI_MSG_ERR,
			"rpi_stepgen: ERROR: hal_malloc() failed\n");
	hal_exit(comp_id);
	return -1;
    }

    /* export all the variables for each pulse generator */
    for (n = 0; n < num_chan; n++) {
	/* export all vars */
	retval = export_rpi_stepgen(n, &(rpi_stepgen_array[n]),
	    step_type[n], (parse_ctrl_type(ctrl_type[n]) == POSITION));
	if (retval != 0) {
	    rtapi_print_msg(RTAPI_MSG_ERR,
		"rpi_stepgen: ERROR: rpi_stepgen %d var export failed\n", n);
	    hal_exit(comp_id);
	    return -1;
	}
    }

    /* export functions */
    retval = hal_export_funct("rpi_stepgen.make-pulses", make_pulses,
	rpi_stepgen_array, 0, 0, comp_id);
    if (retval != 0) {
	rtapi_print_msg(RTAPI_MSG_ERR,
	    "rpi_stepgen: ERROR: makepulses funct export failed\n");
	hal_exit(comp_id);
	return -1;
    }

    retval = hal_export_funct("rpi_stepgen.update-freq", update_freq,
	rpi_stepgen_array, 1, 0, comp_id);
    if (retval != 0) {
	rtapi_print_msg(RTAPI_MSG_ERR,
	    "rpi_stepgen: ERROR: freq update funct export failed\n");
	hal_exit(comp_id);
	return -1;
    }
    retval = hal_export_funct("rpi_stepgen.capture-position", update_pos,
	rpi_stepgen_array, 1, 0, comp_id);
    if (retval != 0) {
	rtapi_print_msg(RTAPI_MSG_ERR,
	 "rpi_stepgen: ERROR: pos update funct export failed\n");
	hal_exit(comp_id);
	return -1;
    }

#if 1
//////////////////////////////////////////////////////////////
/// setup dma buffers etc

// pulse_width_incr_us is the sample interval in microseconds, counted in
// ticks of the 10 MHz (100 ns) PWM clock (see PWM_RNG1 below).
// It sets the rate at which samples change: 20 us = 20,000 ns = 50 kHz.
pulse_width_incr_us = 20;

{
    int channel = 0;
    int subcycle_time_us = 2000;  // 1ms servo loop time * 2 for double buffering

    if (_is_setup == 1) {
	rtapi_print_msg(RTAPI_MSG_ERR, "rpi_stepgen: ERROR: already setup\n");
	hal_exit(comp_id);
	/* bail out instead of carrying on with initialisation */
	return -1;
    }

    // Catch all kind of kill signals
    setup_sighandlers();


    // Initialize common stuff
    pwm_reg = map_peripheral(PWM_BASE, PWM_LEN);
    clk_reg = map_peripheral(CLK_BASE, CLK_LEN);
    gpio_reg = map_peripheral(GPIO_BASE, GPIO_LEN);
    if (pwm_reg == NULL || clk_reg == NULL || gpio_reg == NULL) {
	rtapi_print_msg(RTAPI_MSG_ERR, "rpi_stepgen: ERROR: pwm setup fail\n");
	hal_exit(comp_id);
	/* never touch the registers through a NULL mapping */
	return -1;
    }

    // Start PWM/PCM timing activity
    {
        // Initialise PWM
        pwm_reg[PWM_CTL] = 0;
        udelay(10);
        clk_reg[PWMCLK_CNTL] = 0x5A000006;        // Source=PLLD (500MHz)
        udelay(100);
        clk_reg[PWMCLK_DIV] = 0x5A000000 | (50<<12);    // set pwm div to 50, giving 10MHz
        udelay(100);
        clk_reg[PWMCLK_CNTL] = 0x5A000016;        // Source=PLLD and enable
        udelay(100);
        pwm_reg[PWM_RNG1] = pulse_width_incr_us * 10;
        udelay(10);
        pwm_reg[PWM_DMAC] = PWMDMAC_ENAB | PWMDMAC_THRSHLD;
        udelay(10);
        pwm_reg[PWM_CTL] = PWMCTL_CLRF;
        udelay(10);
        pwm_reg[PWM_CTL] = PWMCTL_USEF1 | PWMCTL_PWEN1;
        udelay(10);
    }

    _is_setup = 1;

// Setup channel
//rtapi_print_msg(RTAPI_MSG_ERR, "rpi_stepgen: init DMA step channel\n");
init_channel(channel, subcycle_time_us);



   for (n = 0; n < MAX_CHAN && step_type[n] != -1 ; n++) {

	// set gpio pins to output (wrapped in do/while so it acts as one statement)
#define set_gpio(a) do { gpio_reg[GPIO_CLR0] = 1u << (a); gpio_set_mode((a), GPIO_MODE_OUT); gpio_setup |= 1u << (a); } while (0)

	// step types 0, 1, 2: two phases
	// step types 3, 4: three phases
	// step types 5 to 10: four phases
	// step types 11 to 14: five phases
	// only the two-pin types are wired to GPIO here
	if (step_type[n] == 0 || step_type[n] == 1 || step_type[n] == 2) {
	    int p = 0;
	    uint32_t tt = step_pins[pincount++];
	    set_gpio(tt);
	    pinlookup[n][p++] = tt;
	    tt = step_pins[pincount++];
	    set_gpio(tt);
	    pinlookup[n][p++] = tt;
	}

rtapi_print_msg(RTAPI_MSG_ERR,"rpi_stepgen: stepping type '%i', axis %i pins (%i,%i)\n",step_type[n], n, pinlookup [n][0], pinlookup [n][1]);
	}
}

#endif



//rtapi_print_msg(RTAPI_MSG_ERR,"rpi_stepgen: numsamp %u  \n",channels[0].num_samples);

    rtapi_print_msg(RTAPI_MSG_INFO,"rpi_stepgen: installed %d step pulse generators\n", num_chan);
    hal_ready(comp_id);
    return 0;
}

void rtapi_app_exit(void)
{
    // All done
    shutdown();
    hal_exit(comp_id);
}

/***********************************************************************
*              REALTIME STEP PULSE GENERATION FUNCTIONS                *
************************************************************************/

/** The frequency generator works by adding a signed value proportional
    to frequency to an accumulator.  When bit PICKOFF of the accumulator
    toggles, a step is generated.
*/




static void make_pulses(void *arg, long nperiod)
{
	int channel=0;
	rpi_stepgen_t *rpi_stepgen;
	long old_addval, target_addval, new_addval, step_now;
	int n, p;
	unsigned char outbits;
	int i;
	uint32_t phys_gpclr0 = 0x7e200000 + 0x28;
	uint32_t phys_gpset0 = 0x7e200000 + 0x1c;

/// reset the period to what the dma period should be...
	long period=pulse_width_incr_us *1000;  // convert us to ns

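/// Important: the buffer must stay synchronised with the DMA engine.
// We should check that the DMA is working on the other half of the
// buffer, but here we assume it is and simply toggle between the first
// and second half to write into.
// Logical NOT keeps the index at 0 or 1; bitwise ~ would turn 0 into
// all ones, which is not a valid half index for the pointer offset below.

	current_buffer = !current_buffer;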
	int step_buffer=channels[channel].num_samples/2; 
	dma_cb_t *cbp = (dma_cb_t *) get_cb(channel);  // + (2* buffer_len)
	uint32_t *dp = (uint32_t *) channels[channel].virtbase + (step_buffer * current_buffer);
	uint32_t bit_pattern=0;

// Fill one half of the buffer with the step data for one servo period.
// step_buffer is half of num_samples
// (num_samples is presumably subcycle_time_us / pulse_width_incr_us).

    for (i = 0; i < step_buffer; i++) {
	bit_pattern = 0;

    /* store period so scaling constants can be (re)calculated */
    periodns = period;
    /* point to rpi_stepgen data structures */
    rpi_stepgen = arg;


    for (n = 0; n < num_chan; n++) {
	/* decrement "timing constraint" timers */
	if ( rpi_stepgen->timer1 > 0 ) {
	    if ( rpi_stepgen->timer1 > periodns ) {
		rpi_stepgen->timer1 -= periodns;
	    } else {
		rpi_stepgen->timer1 = 0;
	    }
	}
	if ( rpi_stepgen->timer2 > 0 ) {
	    if ( rpi_stepgen->timer2 > periodns ) {
		rpi_stepgen->timer2 -= periodns;
	    } else {
		rpi_stepgen->timer2 = 0;
	    }
	}
	if ( rpi_stepgen->timer3 > 0 ) {
	    if ( rpi_stepgen->timer3 > periodns ) {
		rpi_stepgen->timer3 -= periodns;
	    } else {
		rpi_stepgen->timer3 = 0;
		/* last timer timed out, cancel hold */
		rpi_stepgen->hold_dds = 0;
	    }
	}
	if ( !rpi_stepgen->hold_dds && *(rpi_stepgen->enable) ) {
	    /* update addval (ramping) */
	    old_addval = rpi_stepgen->addval;
	    target_addval = rpi_stepgen->target_addval;
	    if (rpi_stepgen->deltalim != 0) {
		/* implement accel/decel limit */
		if (target_addval > (old_addval + rpi_stepgen->deltalim)) {
		    /* new value is too high, increase addval as far as possible */
		    new_addval = old_addval + rpi_stepgen->deltalim;
		} else if (target_addval < (old_addval - rpi_stepgen->deltalim)) {
		    /* new value is too low, decrease addval as far as possible */
		    new_addval = old_addval - rpi_stepgen->deltalim;
		} else {
		    /* new value can be reached in one step - do it */
		    new_addval = target_addval;
		}
	    } else {
		/* go to new freq without any ramping */
		new_addval = target_addval;
	    }
	    /* save result */
	    rpi_stepgen->addval = new_addval;
	    /* check for direction reversal */
	    if (((new_addval >= 0) && (old_addval < 0)) ||
		((new_addval < 0) && (old_addval >= 0))) {
		/* reversal required, can we do so now? */
		if ( rpi_stepgen->timer3 != 0 ) {
		    /* no - hold everything until delays time out */
		    rpi_stepgen->hold_dds = 1;
		}
	    }
	}
	/* update DDS */
	if ( !rpi_stepgen->hold_dds && *(rpi_stepgen->enable) ) {
	    /* save current value of low half of accum */
	    step_now = rpi_stepgen->accum;
	    /* update the accumulator */
	    rpi_stepgen->accum += rpi_stepgen->addval;
	    /* test for changes in low half of accum */
	    step_now ^= rpi_stepgen->accum;
	    /* we only care about the pickoff bit */
	    step_now &= (1L << PICKOFF);
	    /* update rawcounts parameter */
	    rpi_stepgen->rawcount = rpi_stepgen->accum >> PICKOFF;
	} else {
	    /* DDS is in hold, no steps */
	    step_now = 0;
	}
	if ( rpi_stepgen->timer2 == 0 ) {
	    /* update direction - do not change if addval = 0 */
	    if ( rpi_stepgen->addval > 0 ) {
		rpi_stepgen->curr_dir = 1;
	    } else if ( rpi_stepgen->addval < 0 ) {
		rpi_stepgen->curr_dir = -1;
	    }
	}
	if ( step_now ) {
	    /* (re)start various timers */
	    /* timer 1 = time till end of step pulse */
	    rpi_stepgen->timer1 = rpi_stepgen->step_len;
	    /* timer 2 = time till allowed to change dir pin */
	    rpi_stepgen->timer2 = rpi_stepgen->timer1 + rpi_stepgen->dir_hold_dly;
	    /* timer 3 = time till allowed to step the other way */
	    rpi_stepgen->timer3 = rpi_stepgen->timer2 + rpi_stepgen->dir_setup;
	    if ( rpi_stepgen->step_type >= 2 ) {
		/* update state */
		rpi_stepgen->state += rpi_stepgen->curr_dir;
		if ( rpi_stepgen->state < 0 ) {
		    rpi_stepgen->state = rpi_stepgen->cycle_max;
		} else if ( rpi_stepgen->state > rpi_stepgen->cycle_max ) {
		    rpi_stepgen->state = 0;
		}
	    }
	}




#if 1
//#define set_pin1(pin )	 *(rpi_stepgen->phase[STEP_PIN]) = 1;
//#define set_pin0(pin )	 *(rpi_stepgen->phase[DIR_PIN]) = 0;

#define set_pin0(pin  ) bit_pattern=pinlookup [n][pin]; *(dp + i) &= ~(1 << bit_pattern);  // set 0
#define set_pin1(pin  ) bit_pattern=pinlookup [n][pin]; *(dp + i) |= (1 << bit_pattern); // set 1

//#define set_pin1(pin )	 ;
//#define set_pin0(pin )	;

	/* generate output, based on stepping type */
	if (rpi_stepgen->step_type == 0) {
	    /* step/dir output */
	    if ( rpi_stepgen->timer1 != 0 ) {
		set_pin1(STEP_PIN );
	    } else {
		set_pin0(STEP_PIN );
	    }
	    if ( rpi_stepgen->curr_dir < 0 ) {
		set_pin1(DIR_PIN );
		//	 *(rpi_stepgen->phase[DIR_PIN]) = 1;
	    } else {
		set_pin0(DIR_PIN );
		//	 *(rpi_stepgen->phase[DIR_PIN]) = 0;
	    }

	} else if (rpi_stepgen->step_type == 1) {
	    /* up/down */
	    if ( rpi_stepgen->timer1 != 0 ) {
		if ( rpi_stepgen->curr_dir < 0 ) {
		    set_pin0(UP_PIN );
		    set_pin1(DOWN_PIN );
		    //   *(rpi_stepgen->phase[UP_PIN]) = 0;
		    //   *(rpi_stepgen->phase[DOWN_PIN]) = 1;
		} else {
		    set_pin1(UP_PIN );
		    set_pin0(DOWN_PIN );
		    //   *(rpi_stepgen->phase[UP_PIN]) = 1;
		    //   *(rpi_stepgen->phase[DOWN_PIN]) = 0;
		}
	    } else {
		set_pin0(UP_PIN );
		set_pin0(DOWN_PIN );
		//	*(rpi_stepgen->phase[UP_PIN]) = 0;
		//	*(rpi_stepgen->phase[DOWN_PIN]) = 0;
	    }
	}

#if 0
else {
	    /* step type 2 or greater */
	    /* look up correct output pattern */
	    outbits = (rpi_stepgen->lut)[rpi_stepgen->state];
	    /* now output the phase bits */
	    for (p = 0; p < rpi_stepgen->num_phases; p++) {
		/* output one phase */
		*(rpi_stepgen->phase[p]) = outbits & 1;
		/* move to the next phase */
		outbits >>= 1;
	    }
	}
#endif

#endif

	/* move on to next step generator */
	rpi_stepgen++;
    }
    /* done */
    }

}


static void update_pos(void *arg, long period)
{
    long long int accum_a, accum_b;
    rpi_stepgen_t *rpi_stepgen;
    int n;

    rpi_stepgen = arg;

    for (n = 0; n < num_chan; n++) {
	/* 'accum' is a long long, and it's remotely possible that
	   make_pulses could change it half-way through a read.
	   So we have a crude atomic read routine */
	do {
	    accum_a = rpi_stepgen->accum;
	    accum_b = rpi_stepgen->accum;
	} while ( accum_a != accum_b );
	/* compute integer counts */
	*(rpi_stepgen->count) = accum_a >> PICKOFF;
	/* check for change in scale value */
	if (rpi_stepgen->pos_scale != rpi_stepgen->old_scale) {
	    /* get ready to detect future scale changes */
	    rpi_stepgen->old_scale = rpi_stepgen->pos_scale;
	    /* validate the new scale value */
	    if ((rpi_stepgen->pos_scale < 1e-20)
		&& (rpi_stepgen->pos_scale > -1e-20)) {
		/* value too small, divide by zero is a bad thing */
		rpi_stepgen->pos_scale = 1.0;
	    }
	    /* we will need the reciprocal, and the accum is fixed point with
	       fractional bits, so we precalc some stuff */
	    rpi_stepgen->scale_recip = (1.0 / (1L << PICKOFF)) / rpi_stepgen->pos_scale;
	}
	/* scale accumulator to make floating point position, after
	   removing the one-half count offset */
	*(rpi_stepgen->pos_fb) = (double)(accum_a-(1<< (PICKOFF-1))) * rpi_stepgen->scale_recip;
	/* move on to next channel */
	rpi_stepgen++;
    }
    /* done */
}

/* helper function - computes the smallest integral multiple of increment
   that is greater than or equal to value */
static unsigned long ulceil(unsigned long value, unsigned long increment)
{
    if ( value == 0 ) {
	return 0;
    }
    return increment*(((value-1)/increment)+1);
}

static void update_freq(void *arg, long period)
{
    rpi_stepgen_t *rpi_stepgen;
    int n, newperiod;
    long min_step_period;
    long long int accum_a, accum_b;
    double pos_cmd, vel_cmd, curr_pos, curr_vel, avg_v, max_freq, max_ac;
    double match_ac, match_time, est_out, est_cmd, est_err, dp, dv, new_vel;
    double desired_freq;
    /*! \todo FIXME - while this code works just fine, there are a bunch of
       internal variables, many of which hold intermediate results that
       don't really need their own variables.  They are used either for
       clarity, or because that's how the code evolved.  This algorithm
       could use some cleanup and optimization. */
    /* this periodns stuff is a little convoluted because we need to
       calculate some constants here in this relatively slow thread but the
       constants are based on the period of the much faster 'make_pulses()'
       thread. */
    /* only recalc constants if period changes */
    newperiod = 0;
    if (periodns != old_periodns) {
	/* get ready to detect future period changes */
	old_periodns = periodns;
	/* recompute various constants that depend on periodns */
	periodfp = periodns * 0.000000001;
	freqscale = (1L << PICKOFF) * periodfp;
	accelscale = freqscale * periodfp;
	/* force re-evaluation of the timing parameters */
	newperiod = 1;
    }
    /* now recalc constants related to the period of this funct */
    /* only recalc constants if period changes */
    if (period != old_dtns) {
	/* get ready to detect future period changes */
	old_dtns = period;
	/* dT is the period of this thread, used for the position loop */
	dt = period * 0.000000001;
	/* calc the reciprocal once here, to avoid multiple divides later */
	recip_dt = 1.0 / dt;
    }

    /* point at rpi_stepgen data */
    rpi_stepgen = arg;

    /* loop thru generators */
    for (n = 0; n < num_chan; n++) {
	/* check for scale change */
	if (rpi_stepgen->pos_scale != rpi_stepgen->old_scale) {
	    /* get ready to detect future scale changes */
	    rpi_stepgen->old_scale = rpi_stepgen->pos_scale;
	    /* validate the new scale value */
	    if ((rpi_stepgen->pos_scale < 1e-20)
		&& (rpi_stepgen->pos_scale > -1e-20)) {
		/* value too small, divide by zero is a bad thing */
		rpi_stepgen->pos_scale = 1.0;
	    }
	    /* we will need the reciprocal, and the accum is fixed point with
	       fractional bits, so we precalc some stuff */
	    rpi_stepgen->scale_recip = (1.0 / (1L << PICKOFF)) / rpi_stepgen->pos_scale;
	}
	if ( newperiod ) {
	    /* period changed, force recalc of timing parameters */
	    rpi_stepgen->old_step_len = ~0;
	    rpi_stepgen->old_step_space = ~0;
	    rpi_stepgen->old_dir_hold_dly = ~0;
	    rpi_stepgen->old_dir_setup = ~0;
	}
	/* process timing parameters */
	if ( rpi_stepgen->step_len != rpi_stepgen->old_step_len ) {
	    /* must be non-zero */
	    if ( rpi_stepgen->step_len == 0 ) {
		rpi_stepgen->step_len = 1;
	    }
	    /* make integer multiple of periodns */
	    rpi_stepgen->old_step_len = ulceil(rpi_stepgen->step_len, periodns);
	    rpi_stepgen->step_len = rpi_stepgen->old_step_len;
	}
	if ( rpi_stepgen->step_space != rpi_stepgen->old_step_space ) {
	    /* make integer multiple of periodns */
	    rpi_stepgen->old_step_space = ulceil(rpi_stepgen->step_space, periodns);
	    rpi_stepgen->step_space = rpi_stepgen->old_step_space;
	}
	if ( rpi_stepgen->dir_setup != rpi_stepgen->old_dir_setup ) {
	    /* make integer multiple of periodns */
	    rpi_stepgen->old_dir_setup = ulceil(rpi_stepgen->dir_setup, periodns);
	    rpi_stepgen->dir_setup = rpi_stepgen->old_dir_setup;
	}
	if ( rpi_stepgen->dir_hold_dly != rpi_stepgen->old_dir_hold_dly ) {
	    if ( (rpi_stepgen->dir_hold_dly + rpi_stepgen->dir_setup) == 0 ) {
		/* dirdelay must be non-zero for step types 0 and 1 */
		if ( rpi_stepgen->step_type < 2 ) {
		    rpi_stepgen->dir_hold_dly = 1;
		}
	    }
	    rpi_stepgen->old_dir_hold_dly = ulceil(rpi_stepgen->dir_hold_dly, periodns);
	    rpi_stepgen->dir_hold_dly = rpi_stepgen->old_dir_hold_dly;
	}
	/* test for disabled rpi_stepgen */
	if (*rpi_stepgen->enable == 0) {
	    /* disabled: keep updating old_pos_cmd (if in pos ctrl mode) */
	    if ( rpi_stepgen->pos_mode ) {
		rpi_stepgen->old_pos_cmd = *rpi_stepgen->pos_cmd * rpi_stepgen->pos_scale;
	    }
	    /* set velocity to zero */
	    rpi_stepgen->freq = 0;
	    rpi_stepgen->addval = 0;
	    rpi_stepgen->target_addval = 0;
	    /* and skip to next one */
	    rpi_stepgen++;
	    continue;
	}
	/* calculate frequency limit */
	min_step_period = rpi_stepgen->step_len + rpi_stepgen->step_space;
	max_freq = 1.0 / (min_step_period * 0.000000001);
	/* check for user specified frequency limit parameter */
	if (rpi_stepgen->maxvel <= 0.0) {
	    /* set to zero if negative */
	    rpi_stepgen->maxvel = 0.0;
	} else {
	    /* parameter is non-zero, compare to max_freq */
	    desired_freq = rpi_stepgen->maxvel * fabs(rpi_stepgen->pos_scale);
	    if (desired_freq > max_freq) {
		/* parameter is too high, complain about it */
		if(!rpi_stepgen->printed_error) {
		    rtapi_print_msg(RTAPI_MSG_ERR,
			"rpi_stepgen: Channel %d: The requested maximum velocity of %d steps/sec is too high.\n",
			n, (int)desired_freq);
		    rtapi_print_msg(RTAPI_MSG_ERR,
			"rpi_stepgen: The maximum possible frequency is %d steps/second\n",
			(int)max_freq);
		    rpi_stepgen->printed_error = 1;
		}
		/* parameter is too high, limit it */
		rpi_stepgen->maxvel = max_freq / fabs(rpi_stepgen->pos_scale);
	    } else {
		/* lower max_freq to match parameter */
		max_freq = rpi_stepgen->maxvel * fabs(rpi_stepgen->pos_scale);
	    }
	}
	/* set internal accel limit to its absolute max, which is
	   zero to full speed in one thread period */
	max_ac = max_freq * recip_dt;
	/* check for user specified accel limit parameter */
	if (rpi_stepgen->maxaccel <= 0.0) {
	    /* set to zero if negative */
	    rpi_stepgen->maxaccel = 0.0;
	} else {
	    /* parameter is non-zero, compare to max_ac */
	    if ((rpi_stepgen->maxaccel * fabs(rpi_stepgen->pos_scale)) > max_ac) {
		/* parameter is too high, lower it */
		rpi_stepgen->maxaccel = max_ac / fabs(rpi_stepgen->pos_scale);
	    } else {
		/* lower limit to match parameter */
		max_ac = rpi_stepgen->maxaccel * fabs(rpi_stepgen->pos_scale);
	    }
	}
	/* at this point, all scaling, limits, and other parameter
	   changes have been handled - time for the main control */
	if ( rpi_stepgen->pos_mode ) {
	    /* calculate position command in counts */
	    pos_cmd = *rpi_stepgen->pos_cmd * rpi_stepgen->pos_scale;
	    /* calculate velocity command in counts/sec */
	    vel_cmd = (pos_cmd - rpi_stepgen->old_pos_cmd) * recip_dt;
	    rpi_stepgen->old_pos_cmd = pos_cmd;
	    /* 'accum' is a long long, and it's remotely possible that
	       make_pulses could change it half-way through a read.
	       So we have a crude atomic read routine */
	    do {
		accum_a = rpi_stepgen->accum;
		accum_b = rpi_stepgen->accum;
	    } while ( accum_a != accum_b );
	    /* convert from fixed point to double, after subtracting
	       the one-half step offset */
	    curr_pos = (accum_a-(1<< (PICKOFF-1))) * (1.0 / (1L << PICKOFF));
	    /* get velocity in counts/sec */
	    curr_vel = rpi_stepgen->freq;
	    /* At this point we have good values for pos_cmd, curr_pos,
	       vel_cmd, curr_vel, max_freq and max_ac, all in counts,
	       counts/sec, or counts/sec^2.  Now we just have to do
	       something useful with them. */
	    /* determine which way we need to ramp to match velocity */
	    if (vel_cmd > curr_vel) {
		match_ac = max_ac;
	    } else {
		match_ac = -max_ac;
	    }
	    /* determine how long the match would take */
	    match_time = (vel_cmd - curr_vel) / match_ac;
	    /* calc output position at the end of the match */
	    avg_v = (vel_cmd + curr_vel) * 0.5;
	    est_out = curr_pos + avg_v * match_time;
	    /* calculate the expected command position at that time */
	    est_cmd = pos_cmd + vel_cmd * (match_time - 1.5 * dt);
	    /* calculate error at that time */
	    est_err = est_out - est_cmd;
	    if (match_time < dt) {
		/* we can match velocity in one period */
		if (fabs(est_err) < 0.0001) {
		    /* after match the position error will be acceptable */
		    /* so we just do the velocity match */
		    new_vel = vel_cmd;
		} else {
		    /* try to correct position error */
		    new_vel = vel_cmd - 0.5 * est_err * recip_dt;
		    /* apply accel limits */
		    if (new_vel > (curr_vel + max_ac * dt)) {
			new_vel = curr_vel + max_ac * dt;
		    } else if (new_vel < (curr_vel - max_ac * dt)) {
			new_vel = curr_vel - max_ac * dt;
		    }
		}
	    } else {
		/* calculate change in final position if we ramp in the
		   opposite direction for one period */
		dv = -2.0 * match_ac * dt;
		dp = dv * match_time;
		/* decide which way to ramp */
		if (fabs(est_err + dp * 2.0) < fabs(est_err)) {
		    match_ac = -match_ac;
		}
		/* and do it */
		new_vel = curr_vel + match_ac * dt;
	    }
	    /* apply frequency limit */
	    if (new_vel > max_freq) {
		new_vel = max_freq;
	    } else if (new_vel < -max_freq) {
		new_vel = -max_freq;
	    }
	    /* end of position mode */
	} else {
	    /* velocity mode is simpler */
	    /* calculate velocity command in counts/sec */
	    vel_cmd = *(rpi_stepgen->vel_cmd) * rpi_stepgen->pos_scale;
	    /* apply frequency limit */
	    if (vel_cmd > max_freq) {
		vel_cmd = max_freq;
	    } else if (vel_cmd < -max_freq) {
		vel_cmd = -max_freq;
	    }
	    /* calc max change in frequency in one period */
	    dv = max_ac * dt;
	    /* apply accel limit */
	    if ( vel_cmd > (rpi_stepgen->freq + dv) ) {
		new_vel = rpi_stepgen->freq + dv;
	    } else if ( vel_cmd < (rpi_stepgen->freq - dv) ) {
		new_vel = rpi_stepgen->freq - dv;
	    } else {
		new_vel = vel_cmd;
	    }
	    /* end of velocity mode */
	}
	rpi_stepgen->freq = new_vel;
	/* calculate new addval */
	rpi_stepgen->target_addval = rpi_stepgen->freq * freqscale;
	/* calculate new deltalim */
	rpi_stepgen->deltalim = max_ac * accelscale;
	/* move on to next channel */
	rpi_stepgen++;
    }
    /* done */
}

/***********************************************************************
*                   LOCAL FUNCTION DEFINITIONS                         *
************************************************************************/

static int export_rpi_stepgen(int num, rpi_stepgen_t * addr, int step_type, int pos_mode)
{
    int n, retval, msg;

    /* This function exports a lot of stuff, which results in a lot of
       logging if msg_level is at INFO or ALL. So we save the current value
       of msg_level and restore it later.  If you actually need to log this
       function's actions, change the second line below */
    msg = rtapi_get_msg_level();
    rtapi_set_msg_level(RTAPI_MSG_WARN);

    /* export param variable for raw counts */
    retval = hal_param_s32_newf(HAL_RO, &(addr->rawcount), comp_id,
	"rpi_stepgen.%d.rawcounts", num);
    if (retval != 0) { return retval; }
    /* export pin for counts captured by update() */
    retval = hal_pin_s32_newf(HAL_OUT, &(addr->count), comp_id,
	"rpi_stepgen.%d.counts", num);
    if (retval != 0) { return retval; }
    /* export parameter for position scaling */
    retval = hal_param_float_newf(HAL_RW, &(addr->pos_scale), comp_id,
	"rpi_stepgen.%d.position-scale", num);
    if (retval != 0) { return retval; }
    /* export pin for command */
    if ( pos_mode ) {
	retval = hal_pin_float_newf(HAL_IN, &(addr->pos_cmd), comp_id,
	    "rpi_stepgen.%d.position-cmd", num);
    } else {
	retval = hal_pin_float_newf(HAL_IN, &(addr->vel_cmd), comp_id,
	    "rpi_stepgen.%d.velocity-cmd", num);
    }
    if (retval != 0) { return retval; }
    /* export pin for enable command */
    retval = hal_pin_bit_newf(HAL_IN, &(addr->enable), comp_id,
	"rpi_stepgen.%d.enable", num);
    if (retval != 0) { return retval; }
    /* export pin for scaled position captured by update() */
    retval = hal_pin_float_newf(HAL_OUT, &(addr->pos_fb), comp_id,
	"rpi_stepgen.%d.position-fb", num);
    if (retval != 0) { return retval; }
    /* export param for scaled velocity (frequency in Hz) */
    retval = hal_param_float_newf(HAL_RO, &(addr->freq), comp_id,
	"rpi_stepgen.%d.frequency", num);
    if (retval != 0) { return retval; }
    /* export parameter for max frequency */
    retval = hal_param_float_newf(HAL_RW, &(addr->maxvel), comp_id,
	"rpi_stepgen.%d.maxvel", num);
    if (retval != 0) { return retval; }
    /* export parameter for max accel/decel */
    retval = hal_param_float_newf(HAL_RW, &(addr->maxaccel), comp_id,
	"rpi_stepgen.%d.maxaccel", num);
    if (retval != 0) { return retval; }
    /* every step type uses steplen */
    retval = hal_param_u32_newf(HAL_RW, &(addr->step_len), comp_id,
	"rpi_stepgen.%d.steplen", num);
    if (retval != 0) { return retval; }


//////////////////////////////////////////////////////////////////////////////////////
/////////



    if (step_type < 2) {
	/* step/dir and up/down use 'stepspace' */
	retval = hal_param_u32_newf(HAL_RW, &(addr->step_space),
	    comp_id, "rpi_stepgen.%d.stepspace", num);
	if (retval != 0) { return retval; }
    }
    if ( step_type == 0 ) {
	/* step/dir is the only one that uses dirsetup and dirhold */
	retval = hal_param_u32_newf(HAL_RW, &(addr->dir_setup),
	    comp_id, "rpi_stepgen.%d.dirsetup", num);
	if (retval != 0) { return retval; }
	retval = hal_param_u32_newf(HAL_RW, &(addr->dir_hold_dly),
	    comp_id, "rpi_stepgen.%d.dirhold", num);
	if (retval != 0) { return retval; }
    } else {
	/* the others use dirdelay */
	retval = hal_param_u32_newf(HAL_RW, &(addr->dir_hold_dly),
	    comp_id, "rpi_stepgen.%d.dirdelay", num);
	if (retval != 0) { return retval; }
    }
    /* export output pins */


#if 0
    if ( step_type == 0 ) {
	/* step and direction */
	retval = hal_pin_bit_newf(HAL_OUT, &(addr->phase[STEP_PIN]),
	    comp_id, "rpi_stepgen.%d.step", num);
	if (retval != 0) { return retval; }
	*(addr->phase[STEP_PIN]) = 0;
	retval = hal_pin_bit_newf(HAL_OUT, &(addr->phase[DIR_PIN]),
	    comp_id, "rpi_stepgen.%d.dir", num);
	if (retval != 0) { return retval; }
	*(addr->phase[DIR_PIN]) = 0;
    } else if (step_type == 1) {
	/* up and down */
	retval = hal_pin_bit_newf(HAL_OUT, &(addr->phase[UP_PIN]),
	    comp_id, "rpi_stepgen.%d.up", num);
	if (retval != 0) { return retval; }
	*(addr->phase[UP_PIN]) = 0;
	retval = hal_pin_bit_newf(HAL_OUT, &(addr->phase[DOWN_PIN]),
	    comp_id, "rpi_stepgen.%d.down", num);
	if (retval != 0) { return retval; }
	*(addr->phase[DOWN_PIN]) = 0;
    } else {
	/* stepping types 2 and higher use a varying number of phase pins */
	addr->num_phases = num_phases_lut[step_type - 2];
	for (n = 0; n < addr->num_phases; n++) {
	    retval = hal_pin_bit_newf(HAL_OUT, &(addr->phase[n]),
		comp_id, "rpi_stepgen.%d.phase-%c", num, n + 'A');
	    if (retval != 0) { return retval; }
	    *(addr->phase[n]) = 0;
	}
    }
#endif

    /* set default parameter values */
    addr->pos_scale = 1.0;
    addr->old_scale = 0.0;
    addr->scale_recip = 0.0;
    addr->freq = 0.0;
    addr->maxvel = 0.0;
    addr->maxaccel = 0.0;
    addr->step_type = step_type;
    addr->pos_mode = pos_mode;
    /* timing parameter defaults depend on step type */
    addr->step_len = 1;
    if ( step_type < 2 ) {
	addr->step_space = 1;
    } else {
	addr->step_space = 0;
    }
    if ( step_type == 0 ) {
	addr->dir_hold_dly = 1;
	addr->dir_setup = 1;
    } else {
	addr->dir_hold_dly = 1;
	addr->dir_setup = 0;
    }
    /* set 'old' values to make update_freq validate the timing params */
    addr->old_step_len = ~0;
    addr->old_step_space = ~0;
    addr->old_dir_hold_dly = ~0;
    addr->old_dir_setup = ~0;
    if ( step_type >= 2 ) {
	/* init output stuff */
	addr->cycle_max = cycle_len_lut[step_type - 2] - 1;
	addr->lut = &(master_lut[step_type - 2][0]);
    }
    /* init the step generator core to zero output */
    addr->timer1 = 0;
    addr->timer2 = 0;
    addr->timer3 = 0;
    addr->hold_dds = 0;
    addr->addval = 0;
    /* accumulator gets a half step offset, so it will step half
       way between integer positions, not at the integer positions */
    addr->accum = 1 << (PICKOFF-1);
    addr->rawcount = 0;
    addr->curr_dir = 0;
    addr->state = 0;
    *(addr->enable) = 0;
    addr->target_addval = 0;
    addr->deltalim = 0;
    /* other init */
    addr->printed_error = 0;
    addr->old_pos_cmd = 0.0;
    /* set initial pin values */
    *(addr->count) = 0;
    *(addr->pos_fb) = 0.0;
    if ( pos_mode ) {
	*(addr->pos_cmd) = 0.0;
    } else {
	*(addr->vel_cmd) = 0.0;
    }
    /* restore saved message level */
    rtapi_set_msg_level(msg);
    return 0;
}

static int setup_user_step_type(void) {
    int used_phases = 0;
    int i = 0;
    for(i=0; i<10 && user_step_type[i] != -1; i++) {
        master_lut[USER_STEP_TYPE][i] = user_step_type[i];
	used_phases |= user_step_type[i];
    }
    cycle_len_lut[USER_STEP_TYPE] = i;
    if(used_phases & ~0x1f) {
	    rtapi_print_msg(RTAPI_MSG_ERR, "rpi_stepgen: ERROR: "
			    "bad user step type uses more than 5 phases\n");
	    return -EINVAL; // more than 5 phases is not allowed
    }

    if(used_phases & 0x10) num_phases_lut[USER_STEP_TYPE] = 5;
    else if(used_phases & 0x8) num_phases_lut[USER_STEP_TYPE] = 4;
    else if(used_phases & 0x4) num_phases_lut[USER_STEP_TYPE] = 3;
    else if(used_phases & 0x2) num_phases_lut[USER_STEP_TYPE] = 2;
    else if(used_phases & 0x1) num_phases_lut[USER_STEP_TYPE] = 1;

    if(used_phases)
	    rtapi_print_msg(RTAPI_MSG_INFO,
		"User step type has %d phases and %d steps per cycle\n",
		num_phases_lut[USER_STEP_TYPE], i);
    return 0;
}

static CONTROL parse_ctrl_type(const char *ctrl)
{
    if(!ctrl || !*ctrl || *ctrl == 'p' || *ctrl == 'P') return POSITION;
    if(*ctrl == 'v' || *ctrl == 'V') return VELOCITY;
    return INVALID;
}
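
For anyone skimming the listing: the comment above make_pulses says a step is generated whenever bit PICKOFF of the accumulator toggles. Here is a minimal stand-alone sketch of just that idea. The names dds_sketch_t, dds_addval, dds_tick and dds_count_steps are made up for illustration, and SKETCH_PICKOFF = 28 is an assumption (the real PICKOFF is defined earlier in the source); it leaves out the timers, ramping and direction handling.

```c
#include <stdint.h>

/* Assumed for this sketch; the real PICKOFF is defined earlier in the source. */
#define SKETCH_PICKOFF 28

/* Hypothetical stand-alone generator state, not the real rpi_stepgen_t. */
typedef struct {
    int64_t accum;   /* fixed-point position, SKETCH_PICKOFF fractional bits */
    int64_t addval;  /* added every sample; proportional to frequency */
} dds_sketch_t;

/* Value to add per sample for freq_hz steps/s at a sample period of period_ns. */
static int64_t dds_addval(double freq_hz, long period_ns)
{
    return (int64_t)(freq_hz * (double)(1L << SKETCH_PICKOFF) * period_ns * 1e-9);
}

/* Advance one sample; returns 1 when the pickoff bit toggled (a step), else 0. */
static int dds_tick(dds_sketch_t *d)
{
    int64_t before = d->accum;
    d->accum += d->addval;
    return ((before ^ d->accum) & ((int64_t)1 << SKETCH_PICKOFF)) != 0;
}

/* Count the steps produced over n samples. */
static long dds_count_steps(dds_sketch_t *d, long n)
{
    long steps = 0;
    while (n-- > 0)
        steps += dds_tick(d);
    return steps;
}
```

Starting the accumulator at half a step, as export_rpi_stepgen does, and running one second's worth of 20 us samples at a requested 1000 Hz should count 1000 steps.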


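The timing numbers scattered through the setup code can be checked with a little arithmetic. This is only a sketch using the constants from the listing (PLLD at 500 MHz, divider 50, pulse_width_incr_us = 20, subcycle_time_us = 2000). The function names are hypothetical, and it assumes init_channel computes num_samples as subcycle_time_us / pulse_width_incr_us, which the listing does not show.

```c
#include <stdint.h>

/* Values taken from the listing. */
#define PLLD_HZ             500000000u  /* PWM clock source */
#define PWM_DIVISOR         50u         /* integer divider written to PWMCLK_DIV */
#define PULSE_WIDTH_INCR_US 20u         /* pulse_width_incr_us */
#define SUBCYCLE_TIME_US    2000u       /* subcycle_time_us */

/* PWM tick rate after the divider, in Hz. */
static uint32_t pwm_tick_hz(void) { return PLLD_HZ / PWM_DIVISOR; }

/* Value written to PWM_RNG1: PWM ticks per DMA sample. */
static uint32_t pwm_range_ticks(void) { return PULSE_WIDTH_INCR_US * 10u; }

/* Duration of one DMA sample in nanoseconds. */
static uint32_t sample_period_ns(void)
{
    return (uint32_t)((uint64_t)pwm_range_ticks() * 1000000000u / pwm_tick_hz());
}

/* Samples in the whole buffer; ASSUMES num_samples = subcycle / increment. */
static uint32_t samples_total(void) { return SUBCYCLE_TIME_US / PULSE_WIDTH_INCR_US; }

/* Samples filled per make_pulses call (one half of the double buffer). */
static uint32_t samples_per_half(void) { return samples_total() / 2u; }
```

Under those assumptions each half of the buffer holds 50 samples of 20 us, which is exactly one 1 ms servo period per make_pulses call.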

Last edit: 20 Apr 2013 09:07 by mungkie.

22 Apr 2013 19:02 #32990 by andypugh

"All stepgen functions will run at servo_thread period INCLUDING MAKE_PULSES!"

That seems right, as it is also the case with a Mesa stepgen.

"I am not sure, but I assume that the 'period' variable that is the argument of 'make_pulses' is the exact elapsed time in nanoseconds since make_pulses was last called by the RT thread. Can anyone confirm that?"

Lots of the drivers seem to assume this. I don't think it is actually the case. If you set up your driver so that it prints the period to dmesg (probably only the first few dozen times through) you will see an uncanny regularity, or I did last time I looked. That suggests it is the nominal thread period rather than a measured interval.
It might be safer to keep an internal record of clock ticks with rtapi_get_time: www.linuxcnc.org/docs/html/man/man3/rtapi_get_time.3rtapi.html
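
A minimal sketch of what keeping such an internal record could look like. The struct and function names are hypothetical, and the timestamp is passed in as a parameter so the snippet compiles outside RTAPI; in a real component it would come from rtapi_get_time(), which returns the current time in nanoseconds as a long long.

```c
/* Hypothetical helper state: remembers the previous timestamp. */
typedef struct {
    long long last_ns;  /* timestamp of the previous call */
    int primed;         /* 0 until the first timestamp has been stored */
} elapsed_sketch_t;

/* Returns the measured interval since the previous call, or nominal_ns on
   the first call, when there is nothing to measure against yet.
   now_ns would come from rtapi_get_time() in a real component. */
static long long elapsed_since_last(elapsed_sketch_t *e, long long now_ns,
                                    long nominal_ns)
{
    long long dt = e->primed ? now_ns - e->last_ns : nominal_ns;
    e->last_ns = now_ns;
    e->primed = 1;
    return dt;
}
```

Inside make_pulses this would be called as something like elapsed_since_last(&tracker, rtapi_get_time(), nperiod), with tracker a static elapsed_sketch_t; comparing the result with nperiod shows how far the real interval drifts from the nominal one.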

22 Apr 2013 21:00 #32998 by mhaberler

"All stepgen functions will run at servo_thread period INCLUDING MAKE_PULSES!"

"That seems right, as it is also the case with a Mesa stepgen."

Not a good analogy: the Mesa 'base thread' runs on the FPGA, so that path doesn't need a host-side base thread.

See here: www.linuxcnc.org/docs/devel/html/man/man9/stepgen.9.html :

stepgen.make-pulses <---- this one should be on the base thread, i.e. fast

stepgen.capture-position and stepgen.update-freq can be on the servo thread.

You won't get far with make-pulses on the servo thread.
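
To put rough numbers on that, here is a sketch of the frequency limit that update_freq in the listing computes: step length and step space are each rounded up to whole make-pulses periods, and the maximum rate is the reciprocal of their sum. The function names are made up for illustration; round_up mirrors ulceil from the listing.

```c
/* Round value up to a multiple of increment (same idea as ulceil in the listing). */
static unsigned long round_up(unsigned long value, unsigned long increment)
{
    if (value == 0)
        return 0;
    return increment * (((value - 1) / increment) + 1);
}

/* Highest step rate in Hz when step length and step space are each
   rounded up to whole make-pulses periods. */
static double max_step_hz(unsigned long step_len_ns, unsigned long step_space_ns,
                          unsigned long period_ns)
{
    unsigned long min_step_period = round_up(step_len_ns, period_ns)
                                  + round_up(step_space_ns, period_ns);
    return 1.0 / (min_step_period * 1e-9);
}
```

With the default steplen and stepspace of 1 ns from the listing, a 1 ms servo-thread period works out to about 500 steps/s and a 50 us base thread to about 10 kHz. The 20 us sample period that the listing substitutes inside make_pulses would give about 25 kHz by the same arithmetic.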

- Michael
