Is this too much data for one buffer object?

I have been trying to improve the performance of my OpenGL particle game for iOS (and, in the future, Android).

Basically, the big task that OpenGL does is draw the particles. They are simple rounded lines with one color per line and variable widths.

Unfortunately, OpenGL ES 2.0 has neither rounded lines nor lines of variable width, so each particle is drawn by constructing the following:

  1. One point representing the beginning of the line
  2. One point representing the end of the line
  3. 6 points making a rectangle between the two end points (drawn as two triangles)

Each particle has the following attributes:

  1. R Component
  2. G Component
  3. B Component
  4. A Component
  5. Dot size

Right now I draw everything in two stages, with a separate set of buffers for each stage.

Stage 1: Dots

  1. Buffer for dot size
  2. Buffer for color (even though each line has a single color, I still store one color per dot, so two identical entries per particle)
  3. Buffer for dot position (I am working mostly orthographically, so I could get rid of the z component, but I’d rather keep it unless it is a huge drag)

Stage 2: Rectangles

  1. Buffer for color (6 entries per particle)
  2. Buffer for points drawn with GL_TRIANGLES (6 entries per particle)

A lot of the performance loss comes from building these buffers.

All of this data comes from a series of arrays built earlier. This creates substantial latency as well, but it is all needed. The most expensive thing is calculating the 4 points on the rectangle based on the two end points.

Anyway:
I recently saw a post about interleaving buffer data, with the color, position, and any other attributes all packed into one buffer, and I thought this was brilliant! But I'm not sure exactly how far I can go with this.

I guess ideally, for each particle, the buffer would hold the data in the following order:

  1. Point size
  2. Color (RGBA)
  3. End Points (2 points XYZ) (Z would only be needed once because the whole particle will be at the same z position)
  4. Points of the rectangle (XY × 6)

So it would look like
SRGBAXYZXYXYXYXYXYXYXY
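For comparison, the interleaved layouts I have seen are described per vertex rather than per particle, because `glVertexAttribPointer` takes a stride and a byte offset for each attribute. Below is a minimal sketch of what one such vertex could look like. The `Vertex` struct, the constant names, and `write_dot_vertices` are hypothetical, and storing the color as four unsigned bytes instead of four floats is an assumption made for the example, not something I do now.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One interleaved vertex: position, color, and point size side by side. */
typedef struct {
    float   position[3]; /* x, y, z */
    uint8_t color[4];    /* r, g, b, a as 0..255 */
    float   size;        /* point size for the dots */
} Vertex;

/* Values that would be passed as the stride and pointer arguments of
   glVertexAttribPointer (color with GL_UNSIGNED_BYTE, normalized GL_TRUE). */
enum {
    VERTEX_STRIDE   = sizeof(Vertex),
    POSITION_OFFSET = offsetof(Vertex, position),
    COLOR_OFFSET    = offsetof(Vertex, color),
    SIZE_OFFSET     = offsetof(Vertex, size)
};

/* Fills the two dot vertices of one particle. The color and size are still
   repeated for each vertex of the particle. */
void write_dot_vertices(Vertex out[2],
                        const float start[3], const float end[3],
                        const uint8_t rgba[4], float size)
{
    memcpy(out[0].position, start, sizeof out[0].position);
    memcpy(out[1].position, end, sizeof out[1].position);
    for (int i = 0; i < 2; ++i) {
        memcpy(out[i].color, rgba, sizeof out[i].color);
        out[i].size = size;
    }
}
```

With this layout every attribute starts on a 4-byte boundary and one vertex takes 20 bytes, versus 32 bytes if the color were kept as four floats.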

Questions:

  1. Can I somehow only have one color entry per particle instead of repeating it? (even though that one color will be for 2 points in the case of stage one, and 6 points in the case of stage 2)
  2. Could I interleave size, position, and color into one buffer? Would this reduce the time it takes?
  3. Would I be able to merge stage 1 and stage 2 into one buffer? Currently they use two different shaders, but I think I could change that.
  4. I read an Apple document suggesting padding (empty space) between some values… why does that sort of thing help?
  5. Any other suggestions?

Edit: An idea I just had is to use structs to pass everything in. I have done it before in OpenGL… not sure how this goes in OpenGL ES. Also, I don't know how I would get it to draw two different shapes from one buffer.
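One way I imagine the two shapes could share a buffer is to store all the dot vertices first and all the rectangle vertices after them, then issue one `glDrawArrays` call per range. This is only a sketch of the bookkeeping, assuming both stages use the same vertex format; `DrawRange`, `BufferLayout`, and `layout_for` are names invented for the example.

```c
/* A contiguous run of vertices inside one shared buffer. */
typedef struct {
    int first; /* index of the first vertex in the run */
    int count; /* number of vertices in the run */
} DrawRange;

typedef struct {
    DrawRange dots;  /* 2 vertices per particle, drawn as GL_POINTS */
    DrawRange quads; /* 6 vertices per particle, drawn as GL_TRIANGLES */
} BufferLayout;

/* Dots occupy the start of the buffer; rectangles follow immediately. */
BufferLayout layout_for(int particle_count)
{
    BufferLayout layout;
    layout.dots.first  = 0;
    layout.dots.count  = 2 * particle_count;
    layout.quads.first = layout.dots.count;
    layout.quads.count = 6 * particle_count;
    return layout;
}

/* Intended use, with the shared buffer bound and attributes set up:
     glDrawArrays(GL_POINTS,    layout.dots.first,  layout.dots.count);
     glDrawArrays(GL_TRIANGLES, layout.quads.first, layout.quads.count);
*/
```

That would still be two draw calls, and if the two stages keep separate shaders the program would have to be switched between them.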